samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2025-01-11 05:18:09 +03:00

Author	SHA1	Message	Date
Martin Schwenke	028fe930b6	ctdb-recoverd: Fix backward compatibility for CTDB_SRVID_TAKEOVER_RUN When running a mixed version cluster, compatibility with older versions was broken during recent refactorisation. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org>	2013-11-27 18:46:16 +01:00
Amitay Isaacs	41d37058ca	tunables: Remove obsolete tunables Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit ca5fc3431573c44d55d09d987c715fb53756fc1f)	2013-10-30 15:37:11 +11:00
Amitay Isaacs	7eb680a95f	build: Move the default CTDB socket from /tmp to /var/run/ctdb Use /var/run/ctdb/ctdbd.socket because there might be other daemons that need sockets in the future. The local daemons test code to create a link for the default convenience socket has to be removed because the link can't be created as a regular user in the new location. This should be OK since all calls to the ctdb tool in the test code should be wrapped in onnode. When debugging tests, a developer will have to set CTDB_SOCKET by hand. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Pair-programmed-with: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit dc67a4e24af9d07aead2a1710eeaf5d6cc409201)	2013-10-25 12:06:07 +11:00
Martin Schwenke	b595712f25	ctdbd: Simplify database directory setting logic No need to check if the options are set. The options are always set via static defaults. No need to talloc_strdup() the values via wrapper functions. The options aren't going away. Remove now unused ctdb_set_tdb_dir() and similar functions. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 1fe82f3d7b610547ff4945887f15dd6c5798a49b)	2013-10-25 12:06:06 +11:00
Martin Schwenke	bd73e017b0	common: New function ctdb_mkdir_p_or_die() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 7b971df79b0b63f83555205eacf48d49ca3a273a)	2013-10-25 12:06:06 +11:00
Martin Schwenke	c07e3830b3	common: New function mkdir_p() Behaves like mkdir -p. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit afe2145d91725daf1399f0a24f1cddcf65f0ec31)	2013-10-25 12:06:06 +11:00
Michael Adam	49fcfd2cb3	ctdb_client.h: fix build on AIX by removing C++-style comments Reported by John P Janosik <jpjanosi@us.ibm.com> Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 1f327401f2e181780937aa3f6c479376ff787f3f)	2013-10-23 00:53:56 +02:00
Martin Schwenke	e782b61732	ctdbd: Pass the public address file location in ctdb context No need to pass it as an extra argument to ctdb_start_daemon. Also ensure options.public_address_list gets a nice static default. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a3d63a9db89d08bb284b3b3a6db773422f21b477)	2013-10-22 15:37:54 +11:00
Martin Schwenke	4adc8f4f09	ctdbd: Default for event_script_dir should use CTDB_BASE Also get rid of ctdb_set_event_script_dir(). It creates an unnecessary copy of something that will be around for the lifetime of the process. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 21b4d1aba00902f1eee0cbf4f082b0794fd5b738)	2013-10-22 15:37:54 +11:00
Martin Schwenke	f9ce563135	ctdbd: Add nodes_file member to struct ctdb_context This allows ctdb_load_nodes_file() to move to ctdb_server.c and ctdb_set_nlist() to become static. Setting ctdb->nodes_file needs to be done early, before the nodes file is loaded. It is now set from CTDB_BASE instead of ETCDIR, so setting CTDB_BASE also needs to be done earlier. Unhack ctdbd_test.c - it no longer needs to define ctdb_load_nodes_file(). Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 20e705e63bd3b20837cc3ac92fdcf2a9650ccfc8)	2013-10-22 15:37:54 +11:00
Amitay Isaacs	c5ec04f24e	client: Reimplement persistent transaction code using TRANS3_COMMIT Implementing persistent transaction code from Samba. Persistent transaction code was reimplemented in Samba using g_lock.tdb to hold transaction locks and using TRANS3_COMMIT control. Implementation details: 1. When starting a transaction, create a record with "transaction-<dbid>" as key and store current server_id in the structure. 2. If a record already exists, some other client has already started a transaction. Verify that the process corresponding to server_id stored in the record really exists or it's a stale record and overwrite it. 3. All modifications to the actual persistent database are stored in a marshal buffer. 4. When transaction is committed, read the sequence number of the persistent database and increment it. Sequence number record is also stored in the marshal buffer. 5. Send the changed records (marshal buffer) in TRANS3_COMMIT control to all the active nodes. 6. If all controls succeed, verify that the sequence number has been incremented. Commit is successful. If any of the controls fail, abort the transaction. 7. In case sequence number has not yet been incremented, then database recovery has been triggered. So repeat from step 5. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 4e0f1971792c9431d8d51dc57d54ecc9e4576dd5)	2013-10-04 15:46:15 +10:00
Amitay Isaacs	be33efa3e4	ctdbd: Remove transaction code related to TRANS2 commits This removes data types and structure elements related to TRANS2 persistent transaction code. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 22a253b7ccf1ff854cddf0b67969dc84d7d6a654)	2013-10-04 15:20:25 +10:00
Amitay Isaacs	91d644325d	ctdbd: Deprecate TRANS2 commit controls Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 7d176352986317e63696d74252ff5d8eccb2fee5)	2013-10-04 15:20:25 +10:00
Amitay Isaacs	fe62936bb6	include: Remove unused set_dmaster structure Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 2ce3a48cc969d563c26dd295723416c0d7b077a2)	2013-10-04 15:20:25 +10:00
Amitay Isaacs	4ca9b96114	client: Add ctdb_ctrl_getdbseqnum() function Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 8cb1fbbfe88327c9c7ab68e8eded586dff611e57)	2013-10-04 15:15:34 +10:00
Amitay Isaacs	5d47f28e15	client: Add ctdb_ctrl_getdbstatistics() function Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 1e7fca5cdc1d7205cf084e35aace1a5dc46ea294)	2013-10-04 15:15:34 +10:00
Amitay Isaacs	105afa543e	client: Add ctdb_client_check_message_handlers() function Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit c9a9d14c91f203ce964a426a8a1e2c1715af2098)	2013-10-04 15:15:34 +10:00
Martin Schwenke	b33ee7a2a4	recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE The current implementation has a few flaws: * A takeover run is called unconditionally when the timer goes even if the recovery master role has moved. This means a node other than the recovery master can incorrectly do a takeover run. * The rebalancing target nodes are cleared in the setup for a takeover run, regardless of whether the takeover run succeeds. * The timer to force a rebalance isn't cleared if another takeover run occurs before the deadline. Any forced rebalancing will happen in the first takeover run and when the timer expires some time later then an unnecessary takeover run will occur. * If the recovery master role moves then the rebalancing data will stay on the original node and affect the next takeover run to occur if the recovery master role should come back to the original node. Instead, store an array of rebalance target nodes in the recovery master context. This is passed as an extra argument to ctdb_takeover_run() each time it is called and is cleared when a takeover run succeeds. The timer hangs off the array of rebalance target nodes, which is cleared if the node isn't the recovery master. This means that it is possible to lose rebalance data if the recovery master role moves. However, that's a difficult problem to solve. The best way of approaching it is probably to try to stop the recovery master role from jumping around unnecessarily when inactive nodes join the cluster. The long term solution is to avoid this nonsense completely. The IP allocation algorithm needs to cache state between runs so that it knows which nodes have just become healthy. This also needs recovery master stability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c51c1efe5fc7fa668597f2acd435dee16e410fc9)	2013-09-19 12:54:31 +10:00
Martin Schwenke	1793412de2	recoverd: Remove unused CTDB_SRVID_RELOAD_ALL_IPS and handler Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 4cd727439a0824ebb8dbcf737d9888ffc3c41184)	2013-09-19 12:54:31 +10:00
Martin Schwenke	5f0913d321	recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56)	2013-09-19 12:54:31 +10:00
Martin Schwenke	4c3f8dc3bb	recoverd: Make the SRVID request structure generic No need for a separate one for each SRVID. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d9c22b04d5aa7938a3965bd3144568664eb772ce)	2013-09-19 12:54:30 +10:00
Martin Schwenke	fe7f66547b	client: Remove unused function list_of_active_nodes_except_pnn() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d8a76cf79f07dfb5a93c6c9a13f16e3268c7dd57)	2013-09-11 15:35:03 +10:00
Martin Schwenke	1ae731198a	recoverd: Move struct ctdb_public_ip_list back into ctdb_takeover.c This is an internal structure. It was moved into ctdb_private.h a long time ago to allow unit testing. Unit test compilation was changed shortly afterwards to make this unnecessary. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit db57261d7dc264e161659a8c547f44fbd9e88eeb)	2013-08-22 17:00:20 +10:00
Amitay Isaacs	1467b666f2	Revert "LACOUNT: Add back lacount mechanism to defer migrating a fetched/read copy until after default of 20 consecutive requests from the same node" This reverts commit 035c0d981bde8c0eee8b3f24ba8e2dc817e5b504. This is a premature optimization. Record can bounce between nodes very quickly if it is a contended record. There is no need to hold a record on a node unnecessarily. In case record contention becomes bad, enabling sticky records on a database is a better idea. Conflicts: include/ctdb_private.h server/ctdb_tunables.c Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit ac417b0003f0116f116834ad2ac51482d25cfa0d)	2013-08-22 14:08:52 +10:00
Amitay Isaacs	de6b97ce4f	Revert "recoverd: Use correct tdb flags when creating missing databases" This reverts commit 10a057d8e15c8c18e540598a940d3548c731b0b4. This approach would not work when creating local databases since currently there is no control to receive TDB flags for remote databases. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit ca61eb776ab862bd269e45ee0f9f96e7e1e0e001)	2013-08-14 14:15:33 +10:00
Amitay Isaacs	f15e1a28a7	recoverd: Use correct tdb flags when creating missing databases When creating missing databases either locally or remotely, make sure to use the correct tdb flags from other nodes. Without this, volatile databases can get attached without TDB_INCOMPATIBLE_HASH flag. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 10a057d8e15c8c18e540598a940d3548c731b0b4)	2013-08-01 11:08:25 +10:00
Amitay Isaacs	d8fc36781c	ctdbd: Remove incomplete ctdb_db_statistics_wire structure Instead of maintaining another structure, add an element as place holder for marshall buffer of hot keys. This avoids duplication of the structure. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit e73b2e12adc9db1dedb48d32bba3a8406a80f4cd)	2013-07-29 16:00:46 +10:00
Amitay Isaacs	854216236b	Revert "ctdbd: Remove incomplete ctdb_db_statistics_wire structure" The structure cannot be removed without adding support for marshalling keys for hot records. This reverts commit 26a4653df594d351ca0dc1bd5f5b2f5b0eb0a9a5. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 023ca2e84f5ed064a288526b9c2bc7e06674dd81)	2013-07-29 16:00:46 +10:00
Amitay Isaacs	500b26e48f	common/system: Add ctdb_set_process_name() function Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit fc3689c977f48d7988eed0654fb8e5ce4b8bfc8b)	2013-07-10 14:33:19 +10:00
Amitay Isaacs	d46c24f4d0	ctdbd: No need for DeadlockTimeout tunable The code for deadlock detection and killing smbd process causing deadlock has been removed and replaced with external debug script. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 2211cd94bea266547d3e6f167d3160a6b23bec88)	2013-07-10 14:33:18 +10:00
Amitay Isaacs	d36aa928fd	ctdbd: Remove incomplete ctdb_db_statistics_wire structure Send the ctdb_db_statistics directly instead of first copying it to duplicate ctdb_db_statistics_wire structure. This simplifies the implementation of the control to get database statistics. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 26a4653df594d351ca0dc1bd5f5b2f5b0eb0a9a5)	2013-07-10 14:33:18 +10:00
Martin Schwenke	7290798a41	recoverd: Clean up log messages in remote IP verification The log messages in verify_remote_ip_allocation() are confusing because they don't include the PNN of the problem node, because it is not known in this function. Add the PNN of the node being verified as a function argument and then shuffle the log messages around to make them clearer. Also fold 3 nested if statements into just one. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f0942fa01cd422133fc9398f56b4855397d7bc86)	2013-07-05 15:52:33 +10:00
Martin Schwenke	dbd1759eae	util: New function ctdb_die() This is like ctdb_fatal() but exits cleanly without dumping core or generating a backtrace. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c0a9456692c88a7a5542cd893d8f326524d3f94e)	2013-07-05 15:52:33 +10:00
Amitay Isaacs	c6914e3891	banning: Make ctdb_local_node_got_banned() a void function When this function is called, we are already committed to banning and there is no point in failing this function. In case, freezing of databases fails, it will be fixed from recovery daemon. (This used to be ctdb commit bb178338658b4ae32382a1f62f7c21cee1d4878f)	2013-07-02 12:59:08 +10:00
Amitay Isaacs	622ccd09f9	freeze: Make ctdb_start_freeze() a void function If this function fails due to memory errors, there is no way to recover. The best course of action is to abort. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 46efe7a886f8c4c56f19536adc98a73c22db906a)	2013-07-02 12:59:08 +10:00
Martin Schwenke	6a52a87028	ctdbd: Refactor shutdown sequence Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b32fd04bfbf33062d45365b37a7247e272a76ceb)	2013-06-22 15:51:02 +10:00
Martin Schwenke	6d9667f01c	ctdbd: Add new runstate CTDB_RUNSTATE_FIRST_RECOVERY This adds more serialisation to the startup, ensuring that the "startup" event runs after everything to do with the first recovery (including the "recovered" event). Given that it now takes longer to get to the "startup" state, the initscript needs to wait until ctdbd gets to "first_recovery". Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit ed6814ff0a59ddbb1c1b3128b505380f60d7aeb7)	2013-05-24 14:08:07 +10:00
Martin Schwenke	77671b9ef5	ctdbd: New control CTDB_CONTROL_GET_RUNSTATE Also new client function ctdb_ctrl_get_runstate(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit dc4220e6f618cc688b3ca8e52bcb3eec6cb55bb1)	2013-05-24 14:08:07 +10:00
Martin Schwenke	63577c96db	ctdbd: Replace ctdb->done_startup with ctdb->runstate This allows states, including startup and shutdown states, to be clearly tracked. This doesn't include regular runtime "states", which are handled by node flags. Introduce new functions ctdb_set_runstate(), runstate_to_string() and runstate_from_string(). Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 8076773a9924dcf8aff16f7d96b2b9ac383ecc28)	2013-05-24 14:08:06 +10:00
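The runstate_to_string() and runstate_from_string() pair named above can be pictured as a lookup over a small table. In this sketch only the "first_recovery" and "startup" states come from the commit messages; the remaining enum members, the RS_ prefix and the _sketch function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative set of states; only FIRST_RECOVERY and STARTUP are
 * named in the commit messages, the rest are assumed. */
enum runstate_sketch {
	RS_UNKNOWN,
	RS_INIT,
	RS_FIRST_RECOVERY,
	RS_STARTUP,
	RS_RUNNING,
	RS_SHUTDOWN
};

static const struct {
	enum runstate_sketch state;
	const char *name;
} runstate_names[] = {
	{ RS_UNKNOWN, "unknown" },
	{ RS_INIT, "init" },
	{ RS_FIRST_RECOVERY, "first_recovery" },
	{ RS_STARTUP, "startup" },
	{ RS_RUNNING, "running" },
	{ RS_SHUTDOWN, "shutdown" },
};

#define N_RUNSTATES (sizeof(runstate_names) / sizeof(runstate_names[0]))

/* Map a state to its name; unrecognised values map to "unknown". */
const char *runstate_to_string_sketch(enum runstate_sketch state)
{
	for (size_t i = 0; i < N_RUNSTATES; i++) {
		if (runstate_names[i].state == state) {
			return runstate_names[i].name;
		}
	}
	return "unknown";
}

/* Map a name back to a state; unrecognised names map to RS_UNKNOWN. */
enum runstate_sketch runstate_from_string_sketch(const char *name)
{
	assert(name != NULL);
	for (size_t i = 0; i < N_RUNSTATES; i++) {
		if (strcmp(runstate_names[i].name, name) == 0) {
			return runstate_names[i].state;
		}
	}
	return RS_UNKNOWN;
}
```

Keeping both directions in one table means a new state needs only one added row, and a tool that asks the daemon for its runstate can print the value in the same spelling that scripts compare against.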
Amitay Isaacs	1ddc7b0d10	locking: Remove functions that are not used anymore These functions were used in locking child process to do the locking. With locking helper, these are not required. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit c660f33c3eaa1b4a2c4e951c1982979e57374ed4)	2013-05-24 09:06:40 +10:00
Martin Schwenke	54e91df60d	recoverd: Move IP flags into ctdb_takeover.c These should never be seen outside the IP allocation code. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e143abd16ccde2e0edfe103673d31a5fb06b6aef)	2013-05-09 12:55:42 +10:00
Martin Schwenke	0445c988e2	recoverd: Fix tunable NoIPTakeoverOnDisabled, rename to NoIPHostOnAllDisabled This really needs to be per-node. The rename is because nodes with this tunable switched on should drop IPs if they become unhealthy (or disabled in some other way). * Add new flag NODE_FLAGS_NOIPHOST, only used in recovery daemon. * Enhance set_ipflags_internal() and set_ipflags() to setup NODE_FLAGS_NOIPHOST depending on setting of NoIPHostOnAllDisabled and/or whether nodes are disabled/inactive. * Replace can_node_servce_ip() with functions can_node_host_ip() and can_node_takeover_ip(). These functions are the only ones that need to look at NODE_FLAGS_NOIPTAKEOVER and NODE_FLAGS_NOIPHOST. They can make the decision without looking at any other flags due to previous setup. * Remove explicit flag checking in IP allocation functions (including unassign_unsuitable_ips()) and just call can_node_host_ip() and can_node_takeover_ip() as appropriate. * Update test code to handle CTDB_SET_NoIPHostOnAllDisabled. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 1308a51f73f2e29ba4dbebb6111d9309a89732cc)	2013-05-07 16:20:46 +10:00
Martin Schwenke	fa16cccf02	ctdbd: Remove the "stopped" event It isn't used, superseded by "ipreallocated". Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c2bb8596a8af6406ef50e53953884df9d6246a96)	2013-05-06 13:38:21 +10:00
Martin Schwenke	2e59cd5428	ctdbd: New control CTDB_CONTROL_IPREALLOCATED This is an alternative to using ctdb_run_eventscripts() that can be used when in recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 27a44685f0d7a88804b61a1542bb42adc8f88cb1)	2013-05-06 13:38:21 +10:00
Michael Adam	1aa09dd5c3	include: define CTDB_REC_RO_FLAGS - all read-only related record flags This is used for some checks Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-By: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit c7924ce6404bb18641b00d5fbd2fe9da9aaf7959)	2013-04-24 18:48:31 +10:00
Michael Adam	527976d02a	vacuum: introduce the RECEIVE_RECORDS control This is in preparation for turning the vacuuming on the lmaster into a two-phase process: - First the node sends the list of records to be vacuumed to all other nodes with this new RECEIVE_RECORDS control. The remote nodes should store the lmaster's empty current copy. - Only those records that could be stored on all other nodes are processed further. They are sent to all other nodes with the TRY_DELETE_RECORDS control as before for deletion. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-By: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit e397702e271af38204fd99733bbeba7c1db3a999)	2013-04-24 18:47:32 +10:00
Martin Schwenke	7ba42d2c89	util: Removed unused declaration of ctdbd_start() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 1e989894764e4cd1d551c44784d91cb295cd790d)	2013-04-18 13:22:12 +10:00
Martin Schwenke	7ccde44d30	include: Move ctdb_start_daemon() from ctdb_client.h to ctdb_private.h It really is internal. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit abb64f62efaa70df4b87c030b96300eafd98e6a3)	2013-04-18 13:22:12 +10:00
Martin Schwenke	dcf1ac34ab	ctdbd: Add --pidfile option Default is not to create a pid file. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 996e74d3db0c50f91b320af8ab7c43ea6b1136af)	2013-04-18 13:21:59 +10:00
Martin Schwenke	4ede763f3b	util: New functions ctdb_set_child_info() and ctdb_is_child_process() Must be called by all child processes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 59b019a97aad9a731f9080ea5be14d0dbdfe03d6)	2013-04-18 13:18:29 +10:00
Michael Adam	b1a6289b44	ctdbd: unimplement the unused SET_DMASTER control Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 2e92deef5221ee651028ef87138b3113f1fece91)	2013-04-17 12:44:08 +02:00
Amitay Isaacs	9e0f8fa09c	traverse: Add CTDB_CONTROL_TRAVERSE_ALL_EXT to support withemptyrecords Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit e691df43d20871468142c8fb83f7c7303c4ec307)	2013-04-17 12:30:59 +02:00
Amitay Isaacs	dd050cd4ba	util: Add hex_decode_talloc() to decode hex string into a binary blob Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 307416afda707b687f5e89e8438e45c154a4c806)	2013-03-25 17:45:23 +11:00
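Decoding a hex string into a binary blob, as the commit above describes, might look roughly like the following. This sketch uses malloc() instead of talloc so that it stays self-contained; the name hex_decode_sketch() and its signature are assumptions and do not reflect the actual hex_decode_talloc() prototype.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_nibble(char c)
{
	if (c >= '0' && c <= '9') return c - '0';
	if (c >= 'a' && c <= 'f') return c - 'a' + 10;
	if (c >= 'A' && c <= 'F') return c - 'A' + 10;
	return -1;
}

/* Hypothetical sketch: decode `hex` into a malloc()ed buffer.
 * On success returns the buffer and stores its length in *out_len;
 * the caller frees it. Returns NULL on odd length, a non-hex
 * character or allocation failure. */
uint8_t *hex_decode_sketch(const char *hex, size_t *out_len)
{
	size_t n;
	uint8_t *buf;

	assert(hex != NULL && out_len != NULL);
	n = strlen(hex);
	if (n % 2 != 0) {
		return NULL;
	}

	/* +1 so that an empty string still yields a non-NULL buffer. */
	buf = malloc(n / 2 + 1);
	if (buf == NULL) {
		return NULL;
	}

	for (size_t i = 0; i < n / 2; i++) {
		int hi = hex_nibble(hex[2 * i]);
		int lo = hex_nibble(hex[2 * i + 1]);

		if (hi < 0 || lo < 0) {
			free(buf);
			return NULL;
		}
		buf[i] = (uint8_t)((hi << 4) | lo);
	}

	*out_len = n / 2;
	return buf;
}
```

A talloc-based version would take a memory context as its first argument and allocate against it, so that the blob is released together with its parent.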
Amitay Isaacs	5d7efb4cf1	ctdbd: Add an index db for message list for faster searches When CTDB is busy with lots of smbd, CTDB was spending too much time in daemon_check_srvids() which searches a list of srvids in the registered message handlers. Using a hash based index significantly improves the performance of search in a linked list. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 3e09f25d419635f6dd679b48fa65370f7860be7d)	2013-03-06 15:32:33 +11:00
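The gain described above comes from replacing a walk over every registered handler with a keyed lookup. The commit uses an index database for this; the sketch below swaps in a plain chained hash table to show the same idea in a self-contained form. All names here (struct srvid_index, srvid_index_add() and so on) and the bucket count are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define SRVID_BUCKETS 256

struct srvid_entry {
	uint64_t srvid;
	struct srvid_entry *next;
};

/* Zero-initialise before use, e.g. `struct srvid_index idx = {0};`. */
struct srvid_index {
	struct srvid_entry *buckets[SRVID_BUCKETS];
};

/* Mix the bits so that srvids differing only in high bits spread out. */
static size_t srvid_bucket(uint64_t srvid)
{
	srvid ^= srvid >> 33;
	srvid *= 0xff51afd7ed558ccdULL;
	srvid ^= srvid >> 33;
	return (size_t)(srvid % SRVID_BUCKETS);
}

/* Lookup touches one bucket instead of the whole handler list. */
bool srvid_index_exists(const struct srvid_index *idx, uint64_t srvid)
{
	assert(idx != NULL);
	for (const struct srvid_entry *e = idx->buckets[srvid_bucket(srvid)];
	     e != NULL; e = e->next) {
		if (e->srvid == srvid) {
			return true;
		}
	}
	return false;
}

/* Register a srvid; returns false only on allocation failure. */
bool srvid_index_add(struct srvid_index *idx, uint64_t srvid)
{
	struct srvid_entry *e;
	size_t b;

	if (srvid_index_exists(idx, srvid)) {
		return true;
	}
	e = malloc(sizeof(*e));
	if (e == NULL) {
		return false;
	}
	b = srvid_bucket(srvid);
	e->srvid = srvid;
	e->next = idx->buckets[b];
	idx->buckets[b] = e;
	return true;
}

/* Release every entry and leave the index empty. */
void srvid_index_free(struct srvid_index *idx)
{
	assert(idx != NULL);
	for (size_t b = 0; b < SRVID_BUCKETS; b++) {
		while (idx->buckets[b] != NULL) {
			struct srvid_entry *e = idx->buckets[b];
			idx->buckets[b] = e->next;
			free(e);
		}
	}
}
```

A check over a batch of srvids then costs one bucket probe per srvid, whereas the linked-list search costs one pass over all handlers per srvid.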
Martin Schwenke	dab2f6817d	client: New generic node listing function list_of_nodes() Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit a73bb56991b8c07ed0e9517ffcf0dc264be30487)	2013-02-20 14:44:38 +11:00
Martin Schwenke	689384a7b4	Logging: Fix breakage when freeing the log ringbuffer Commit a82d3ec12f0fda16d6bfa8442a07595de897c10e broke fetching from the log ringbuffer. The solution there is still generally good: there is no need to keep the ringbuffer in children created by ctdb_fork()... except for those special children that are created to fetch data from the ringbuffer! Introduce a new function ctdb_fork_no_free_ringbuffer() that does everything ctdb_fork() needs to do except free the ringbuffer (i.e. it is the old ctdb_fork() function). The new ctdb_fork() function just calls that function and then frees the ringbuffer in the child. This means all callers of ctdb_fork() have the convenience of having the ringbuffer freed. There are 3 special cases: * Forking the recovery daemon. We want to be able to fetch from the ringbuffer there. * The ringbuffer fetching code. Change the 2 calls in this code (main daemon, recovery daemon) to call ctdb_fork_no_free_ringbuffer() instead. While we're here, clear the log ringbuffer when the recovery daemon is forked, since it will contain a copy of the messages from the main daemon. Note to self: always test... even the most obvious patches... ;-) Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db5fa00474f8a83f1aa3b603fd756cc9b49ff4)	2013-02-07 11:26:29 +11:00
Martin Schwenke	bc5f0a2b65	ctdbd: Remove command-line option --debug-hung-script Use an environment variable instead. This just means that the initscript exports CTDB_DEBUG_HUNG_SCRIPT and the code checks for the environment variable. The justification for this simplification is that more debug options will be arriving soon and we want to handle them consistently without needing to add a command-line option for each. So, the convention will be to use an environment variable for each debug option. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0581f9a84e58764d194f4e04064c2c5b393c348b)	2013-02-05 16:05:13 +11:00
Martin Schwenke	f2428cadd8	ctdbd: Remove debug_hung_script_ctx The only allocation against this context is by ctdb_fork_with_logging(). This memory is freed by ctdb_log_handler() anyway. There should be no memory leak. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 501461cc3e132d4adee9e91b5d4513a26bae2846)	2013-02-05 16:05:13 +11:00
Martin Schwenke	f2ba0e8a65	Logging: New function ctdb_log_ringbuffer_free() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a4f622e85168f59417c11705f1734e0352e1d44a)	2013-02-05 12:40:30 +11:00
Amitay Isaacs	4a6fa39ff9	daemon: Protect against double free of callback state while shutting down When CTDB is shut down and monitoring has been stopped, monitor_context gets freed and all the callback states hanging off it. This includes callback state for current_monitor, if the current monitor event has not yet finished. As a result, when the shutdown event is called, current_monitor->callback state is not NULL, but it's actually freed and it's a dangling reference. So before executing callback function and freeing callback state check if ctdb->monitor->monitor_context is not NULL. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 7d8546ee4353851f0543d0ca2c4c67cb0cc75aea)	2013-01-09 14:39:23 +11:00
Amitay Isaacs	30299c387f	daemon: On shutdown, destroy timed events that check if recoverd is active When CTDB is shutting down, recovery daemon is stopped, but the event that checks if recovery daemon is still alive is not destroyed. So recovery master is restarted during shutdown if CTDB daemon takes longer to shutdown. There are two processes that check if recovery daemon is working. 1. ctdb_check_recd() - which checks every 30 seconds if the recovery daemon process exists. 2. ctdb_recd_ping_timeout() - which is triggered when recovery daemon fails to ping CTDB daemon. Both the events are periodic and need to be destroyed when shutting down. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 746168df2e691058e601016110fae818c6a265c3)	2013-01-09 13:20:26 +11:00
Martin Schwenke	80a2bb84e7	ctdbd: Remove debug option --node-ip, use --listen instead This effectively reverts d96cb02c2c24f9eabbc53d3d38e90dea49cff3e0 Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 496387a585b2c5778c808cf02b8e1435abde4c3e)	2013-01-07 10:35:39 +11:00
Amitay Isaacs	a73f13ada7	daemon: Add a tunable to enable automatic database priority setting Samba versions 3.6.x and older do not set the database priority. This can cause deadlock between Samba and CTDB since the locking order of database will be different. A hack was added for automatic promotion of priority for specific databases to avoid deadlock. This code should not be invoked with Samba version 4.x which correctly specifies the priority for each database. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 4a9e96ad3d8fc46da1cd44cd82309c1b54301eb7)	2013-01-05 01:14:57 +01:00
Amitay Isaacs	13518b9e33	daemon: Check if log_latency_ms is set before using it This fixes a bug where wrong variable is checked. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit f81e9add466b1d9b2796c09c6ba63b77296ea149)	2012-11-30 12:21:30 +11:00
Amitay Isaacs	442d9905fe	locking: Do not use RECLOCK for tracking DB locks and latencies RECLOCK is for recovery lock in CTDB. Do not override the meaning for tracking locks on databases. Database lock latency has nothing to do with recovery lock latency. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 54e24a151d2163954e5a2a1c0f41a2b5c19ae44b)	2012-11-14 15:51:59 +11:00
Amitay Isaacs	85c8deca3f	recoverd: Track the nodes that fail takeover run and set culprit count If any of the nodes fail takeover run (either due to timeout or failure to complete within takeover_timeout interval) from main loop, recovery master will give up trying takeover run with following message: "Unable to setup public takeover addresses. Try again later" And as a side-effect the monitoring is disabled on all the nodes. Before ctdb_takeover_run() is called from main loop, monitoring gets disabled via startrecovery event. Since ctdb_takeover_run() fails, it never runs recovered event and monitoring does not get re-enabled. In main_loop, ctdb_takeover_run() is called with a takeover_fail_callback. This callback will get called if any of the nodes fail in handling takeip/releaseip/ipreallocated events in ctdb_takeover_run(). Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit a5c6bb1fffb8dc3960af113957a1fd080cc7c245)	2012-11-14 10:59:54 +11:00
Martin Schwenke	db5dfe891c	recoverd: Add CTDB_SRVID_GETLOG and CTDB_SRVID_CLEARLOG These support getting and clearing logs from the ring-buffer in the recovery daemon. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit cbca233d1e03b2410e0bb63b936328d4a8b3c7b4)	2012-10-22 11:15:36 +11:00
Amitay Isaacs	bc126ccdd4	build: Set CTDB_PATH to /tmp/ctdb.socket if SOCKPATH is not defined When building samba with CTDB, if samba configure/waf does not support setting of SOCKPATH, fall back to /tmp/ctdb.socket. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit a9511cf5ecd5bc39b0070f0afa8ac4d4926c6cab)	2012-10-22 09:01:27 +11:00
David Disseldorp	8cbf1a00c4	Build: Set the default ctdb socket path at configure time The ctdb socket path currently defaults to /tmp/ctdb.socket and can be modified at runtime using the --socket=filename option, common to both ctdb and ctdbd binaries. This change allows the default path to be set at configure time using the --with-socketpath=FILE argument. When not specified, the default path remains /tmp/ctdb.socket, documentation remains unchanged as a result. Signed-off-by: David Disseldorp <ddiss@samba.org> (This used to be ctdb commit f92b9c83a2f39fba9a141417a88de96fc8c592ff)	2012-10-21 01:39:08 +11:00
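Taken together, the two build commits above describe a compile-time default that a runtime option can still override. A minimal sketch of that arrangement follows. The macro names SOCKPATH and CTDB_PATH and the /tmp/ctdb.socket fallback come from the commit messages; the helper socket_path_sketch() and the way it receives the --socket value are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Use the configure-time path when the build system supplies one,
 * otherwise fall back to the historical default. */
#ifdef SOCKPATH
#define CTDB_PATH SOCKPATH
#else
#define CTDB_PATH "/tmp/ctdb.socket"
#endif

/* Hypothetical helper: a non-empty --socket value wins, otherwise
 * the compiled-in default is used. */
const char *socket_path_sketch(const char *cli_socket)
{
	/* The compiled-in default must never be an empty string. */
	assert(CTDB_PATH[0] != '\0');

	if (cli_socket != NULL && cli_socket[0] != '\0') {
		return cli_socket;
	}
	return CTDB_PATH;
}
```

Compiling with a definition such as -DSOCKPATH='"/var/run/ctdb/ctdbd.socket"' changes the default without touching the source, which is the effect a configure-time option has.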
Amitay Isaacs	a00e50e503	ctdbd: Replace lockwait with locking API and remove ctdb_lockwait.c Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 2126795153dacb255e441abcb36ee05107b6282a)	2012-10-20 02:48:44 +11:00
Amitay Isaacs	83306337df	ctdbd: locking: Provide non-blocking API for locking of TDB record/db/alldb This introduces a consistent API for handling locks on single record, complete db or all dbs. The locks are taken out in a child process. In cases of timeout, find the processes that currently hold the lock and log. Callback functions for locking requests take locked boolean to indicate whether the lock was successfully obtained or not. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 1af99cf0de9919dd89af1feab6d1bd18b95d82ff)	2012-10-20 02:48:44 +11:00
Amitay Isaacs	1011d10a51	common: Add routines to get process and lock information Currently these functions are implemented only for Linux. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit be4051326b0c6a0fd301561af10fd15a0e90023b)	2012-10-20 02:48:44 +11:00
Amitay Isaacs	ef79dc012e	header: Added DB statistics update macros Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit a0cdfae7438092f5c605f0608daa536be860b7fe)	2012-10-20 02:48:44 +11:00
Martin Schwenke	8d7562f3f8	common: Debug ctdb_addr_to_str() using new function ctdb_external_trace() We've seen this function report "Unknown family, 0" and then CTDB disappeared without a trace. If we can reproduce it then this might help us to debug it. The idea is that you do something like the following in /etc/sysconfig/ctdb: export CTDB_EXTERNAL_TRACE="/etc/ctdb/config/gcore_trace.sh" When we hit this error then we call out to gcore to get a core file so we can do forensics. This might block CTDB for a few seconds. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 7895bc003f087ab2f3181df3c464386f59bfcc39)	2012-10-18 20:05:42 +11:00
Martin Schwenke	4b4e4d8870	ctdbd: Stop takeovers and releases from colliding in mid-air There's a race here where release and takeover events for an IP can run at the same time. For example, a "ctdb deleteip" and a takeover initiated by the recovery daemon. The timeline is as follows: 1. The release code registers a callback to update the VNN. The callback is executed after the eventscripts run the releaseip event. 2. The release code calls the eventscripts for the releaseip event, removing IP from its interface. The takeover code "updates" the VNN saying that IP is on some iface.... even if/though the address is already there. 3. The release callback runs, removing the iface associated with IP in the VNN. The takeover code calls the eventscripts for the takeip event, adding IP to an interface. As a result, CTDB doesn't think it should be hosting IP but IP is on an interface. The recovery daemon fixes this later... but it shouldn't happen. This patch can cause some additional noise in the logs: Release of IP 10.0.2.133/24 on interface eth2 node:2 recoverd:We are still serving a public address '10.0.2.133' that we should not be serving. Removing it. Release of IP 10.0.2.133/24 rejected update for this IP already in flight recoverd:client/ctdb_client.c:2455 ctdb_control for release_ip failed recoverd:Failed to release local ip address In this case the node has started releasing an IP when the recovery daemon notices the address is still hosted and initiates another release. This noise is harmless but annoying. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit bfe16cf69bf2eee93c0d831f76d88bba0c2b96c2)	2012-10-11 12:10:45 +11:00
Martin Schwenke	79ea15bf96	ctdbd: New tunable NoIPTakeoverOnDisabled Stops the behaviour where unhealthy nodes can host IPs when there are no healthy nodes. Set this to 1 when an immediate complete outage is preferred when all nodes are unhealthy. The alternative (i.e. default) can lead to undefined behaviour when the shared filesystem is unavailable. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a555940fb5c914b7581667a05153256ad7d17774)	2012-10-11 12:10:45 +11:00
Volker Lendecke	a68512c7d8	Correct include for ctdb_protocol.h With an old ctdb_protocol.h installed under /usr/local, ctdb will not compile because the <> form of include will find the header under /usr/local (This used to be ctdb commit c4f5a58471b206e2287c7958c7f29c1f1c0626ac)	2012-10-09 23:13:29 +11:00
Martin Schwenke	e05fc0e7b0	libctdb: add ctdb_getcapabilities() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 140fafef23050d40d66f5b5558c7efcb78f80cd2)	2012-09-28 17:05:34 +10:00
Ronnie Sahlberg	d21337a0fb	Add new command to find which interface is located on (This used to be ctdb commit f07376309e70f5ccdb7de8453caacc71b451ab48)	2012-06-20 15:11:49 +10:00
Ronnie Sahlberg	59565c05cf	STATISTICS: Add tracking of the 10 hottest keys per database measured in hopcount and add mechanisms to dump it using the ctdb dbstatistics command (This used to be ctdb commit 8307c70ed98996b430c470e9641a09fdeeb81bd8)	2012-06-13 16:19:18 +10:00
Amitay Isaacs	7631830152	server: Replace BOOL datatype with bool, True/False with true/false Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 6e5cbe8fff71985e5a2fc16b7e9f2b868011ff5d)	2012-05-28 11:22:25 +10:00
Ronnie Sahlberg	e7d21834ae	RECOVER: When we pull databases during recovery, we used to reallocate the databuffer for each entry added. This would normally not be an issue, but for cases where memory is fragmented, this could start to cost significant cpu if we need to reallocate and move to a different region. Change this to instead preallocate, by default, 10MByte chunks to the data buffer. This significantly reduces the number of potential reallocate and move operations that may be required. Create a tunable to override/change how much preallocation should be used. (This used to be ctdb commit 1f262deaad0818f159f9c68330f7fec121679023)	2012-05-25 12:34:06 +10:00
Ronnie Sahlberg	26322d257d	DEBUG: Add checks for and print debug messages when 1) a database contains very many records, 2) when a database is very big, 3) when a single record is very big. Add tunables to control when to log these instances and allow it to be completely turned off by setting the threshold to 0 (This used to be ctdb commit 9ed58fef4991725f75509433496f4d5ffae0ae87)	2012-05-21 13:26:13 +10:00
Ronnie Sahlberg	dce5969d12	Debug: When scripts hang, we may need to collect additional data in order to debug why the script hung. Break this debug and datacollection out into an external script to make it easier to modify what data we need to collect. For now we only collect a pstree so we can see what part of the script we hung in. S1037271 (This used to be ctdb commit 6e68797af67bee36f2bad045f94806e7e98f27e9)	2012-05-17 10:29:03 +10:00
Ronnie Sahlberg	a57eba2bb4	Track all child process so we never send a signal to an unrelated process (our child died and kernel wrapped the pid-space and reused the pid for a different process Wrap all creation of child processes inside ctdb_fork() which is used to track all processes we have spawned. Capture SIGCHLD to track also which child processes have terminated. Wrap kill() inside ctdb_kill() and make sure that we never send a !0 signal to a child process pid that has already terminated (and might have been replaced with a (This used to be ctdb commit f73a4b1495830bcdd094a93732a89dd53b3c2f78)	2012-05-03 14:03:26 +10:00
Ronnie Sahlberg	a367fa6138	RELOADIPS: simplify the reloadips code a bit and also update the "read public address file" to not check if the address exists already locally when we read it from the child process, to stop it from spamming the logs with "We already host ..." messages (This used to be ctdb commit 334ea830f1bf33419f4a1e78f23afd41a852d0f4)	2012-05-01 15:34:26 +10:00
Ronnie Sahlberg	7a1aa560e7	Add new control to reload the public ip address file on a node Also add a method to use the recovery master/daemon to reload the public ips on all nodes in the cluster. Reloading the public ips on all nodes in the cluster is only supported if all nodes in the cluster are available and healthy. (This used to be ctdb commit 05603e914f8c12618d7e06943c0f7df207f645b0)	2012-05-01 10:48:08 +10:00
Amitay Isaacs	131d35d67d	includes: Move special tevent defines from tevent.h to includes.h This allows to build against system tevent library. Also include tevent header along with other common headers. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 9ae4389c2c959c5dcd8395fdae2b25ed7e1e873a)	2012-04-13 17:28:14 +10:00
Martin Schwenke	fbe64dec01	Undo damage done by d8d37493478a26c5f1809a5f3df89ffd6e149281 The implementation of DisableIPFailover got intermingled with --nopublicipcheck. This just looks wrong - Ronnie must have been having a bad day. :-) Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5083b266dd68b292c4275505f3d1b878dbf12f11)	2012-03-22 15:34:52 +11:00
Ronnie Sahlberg	2456f77ca6	NoIPTakeover: change the tunable name for the "dont allow failing addresses over onto the node" to NoIPTakeover (This used to be ctdb commit 35592e618cfd827b6978af6332f80504f232c46a)	2012-03-22 11:05:15 +11:00
Ronnie Sahlberg	befa9df152	Make NoIPFailback a node local setting. Nodes that have NoIPFailback set to !0 can not takeover new ip addresses during failover. Remove the old global setting for this unused tunable and add it as a new node flag. This node flag is only valid/defined within the takeover subsystem in the recovery daemon. Add async functions to collect the NoIPFailback settings for each node. This will later be used to disqualify certain nodes from being takeover targets when we perform reallocation. (This used to be ctdb commit 668f3e88a9e5f598706952b7140547640c85a5ed)	2012-03-22 09:09:57 +11:00
Ronnie Sahlberg	fa3a06246a	STICKY: add prototype code to make records stick to a node to "calm" down if they are found to be very hot and accessed by a lot of clients. This can improve performance and stop clients from having to chase a rapidly migrating/bouncing record (This used to be ctdb commit d0d98f7e45e5084b81335b004d50bddc80cdc219)	2012-03-20 17:12:19 +11:00
Ronnie Sahlberg	e7e51ddb64	LACOUNT: Add back lacount mechanism to defer migrating a fetched/read copy until after default of 20 consecutive requests from the same node This can improve performance slightly on certain workloads where smbds frequently read from the same record (This used to be ctdb commit 035c0d981bde8c0eee8b3f24ba8e2dc817e5b504)	2012-03-20 12:26:22 +11:00
Ronnie Sahlberg	6a493a0b08	STATISTICS: add per-db hop count statistics (This used to be ctdb commit 1c976d83b1d7dac6f0ef81306774998e4c8b56a1)	2012-03-20 12:11:55 +11:00
Ronnie Sahlberg	c051f67d67	FETCH COLLAPSE : Change the fetch-lock collapse to collapse ALL fetches, including fetch-locks into a single command in flight per record. Also add a tunable to enable/disable this optimization for hot records (This used to be ctdb commit eafd7bbaaa5931546a96c8beae3cf9a39a49c925)	2012-03-20 11:39:00 +11:00
Ronnie Sahlberg	038c946e80	add max hop count buckets to see how bad hopcounts are (This used to be ctdb commit 7d3931298e6477d92f43652c3006b0c426cb1307)	2012-03-20 11:20:53 +11:00
Ronnie Sahlberg	f3600276fc	Add a tunable variable to control how long we defer after a ctdb addip until we force a rebalance and try to failback addresses onto this node Have it default to 300 seconds. (This used to be ctdb commit 49791db7dc74cffd7e88bd73091590cdc1909328)	2012-02-28 06:58:59 +11:00
Ronnie Sahlberg	ef2bd0b016	When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance. (This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2)	2012-02-28 06:56:04 +11:00
Ronnie Sahlberg	93ec9c589c	Eventscripts: remove the horrible horrible circular reference between state and callback since these two structures do not even share the same parent talloc context. Instead, tie them together via referencing a permanent linked list hung off the ctdb structure. (This used to be ctdb commit a95c02da6c67dc4bd8716b75318a4188301df6f9)	2012-02-23 06:49:47 +11:00
Ronnie Sahlberg	42e477b14e	READONLY: only send a control to schedule fast-vacuuming from child context iff we have a connection open to the main daemon there are some child processes where we do not create a connection to the main daemon (switch_from_server_to_client()) because it is expensive to set up and we normally might not need to talk to the daemon at all via a domainsocket. but we might want to still call to ctdb_ltdb_store() from such child processes. (This used to be ctdb commit 9e372a08c40087e6b5335aa298e94d88273566a5)	2012-02-21 07:03:44 +11:00
Ronnie Sahlberg	73f8be16c6	ReadOnly: add per-database statistics to view how much delegations/revokes we have (This used to be ctdb commit 751ed46197661eb841042ab6a02855a51dd0b17c)	2012-02-08 15:29:27 +11:00
Ronnie Sahlberg	1eafa68f0f	STATISTICS: add total counts for number of delegations and number of revokes Everytime we give a delegation to another node we count this as one delegation. If the same record is delegated to several nodes we count one for each node. Everytime a record has all its delegations revoked we count this as one revoke. (This used to be ctdb commit b098bcf8007be63889aaed640a951b0eeaa9d191)	2012-02-08 13:42:30 +11:00
Martin Schwenke	ed8a8ee966	libctdb - add ctdb_getvnnmap() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f6039eaece4224b866a98dd49010f278a7b3f015)	2012-02-06 16:00:23 +11:00
Ronnie Sahlberg	e648045499	Merge branch 'master' of ssh://git.samba.org/data/git/ctdb (This used to be ctdb commit 15d8ae8b0f80f95d7839528b8ac60aa0e2485c77)	2012-01-03 12:40:15 +11:00
Michael Adam	e04fad0ee4	vacuum: add new tunable VacuumInterval and mark Vacuum{Default,Min,Max}Interval obsolete And use VacuumInterval instead of VacuumDefaultInterval in the code. (This used to be ctdb commit 78530f40338f511a7cd1d33ada450905742bfa8f)	2011-12-23 17:39:02 +01:00
Michael Adam	a481ca711f	vacuum: add ctdb_local_remove_from_delete_queue() Pair-Programmed-With: Stefan Metzmacher <metze@samba.org> (This used to be ctdb commit a5065b42a98c709173503e02d217f97792878625)	2011-12-23 17:39:00 +01:00
Martin Schwenke	8b74037633	ctdb tool - generalise nodestring parsing for -n Centralise -n nodestring parsing and add the ability to pass a comma-separated list of node numbers. Listing a node that is disconnected or deleted results in failure, similar to the way passing a single node currently works. All of the auto_all commands inherit this functionality. For now, the non-auto_all commands do not inherit this - they need to be individually tweaked. Therefore, we haven't updated the documentation to advertise this feature. Implemented via a new function parse_nodestring() that parses an optional (pass NULL when not available to indicate "current node") comma-separated list of node numbers or "all". parse_nodestring() can be told to be non-fatal for disconnected/deleted nodes so it can also be used in other contexts (yes, coming soon). main() is changed to call this function. A new magic PNN value CTDB_MULTICAST is added and along with a corresponding option.nodes structure member (a talloc-ed array of PNNs). This is also populated for "all" as well. control_status() has new function pretty_print_flags() factored out so pretty-printed flags can be used in error/debug messages. New function is_partially_online() is also factored out - this simplifies some of the logic. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 920e3a732eb9e09004edde6cfb3c7db8a004016f)	2011-12-08 17:00:17 +11:00
Ronnie Sahlberg	609149bdc8	LibCTDB: Add support for the 'get interfaces' control and update the ctdb tool to use this interface (This used to be ctdb commit 77dc0c7351071243d9096d3607d7499c82f46ec0)	2011-12-06 13:12:18 +11:00
Mathieu Parent	bb3d6698e9	Move platform-specific code to common/system_* This removes #ifdef AIX and ease the addition of new platforms. (This used to be ctdb commit 2fd1067a075fe0e4b2a36d4ea18af139d03f17bf)	2011-12-06 11:57:11 +11:00
Michael Adam	ad0de5494e	traverse: fix traversing with empty records by adding a new (internal) control CTDB_CONTROL_TRAVERSE_START_EXT By this, the original CTDB_CONTROL_TRAVERSE_START control that is used by e.g. samba's smbstatus, is not changed, so that samba continues working without code change. The CTDB_CONTROL_TRAVERSE_START currently just adds the "withemptyrecords" flag to the state and processon on as CTDB_CONTROL_TRAVERSE_START_EXT. (This used to be ctdb commit 8281bb210858ed04992eacea7f6d02261e0fc1b1)	2011-12-03 02:15:30 +01:00
Ronnie Sahlberg	11f3c947e6	LibCTDB: add support for the check-srvids control (This used to be ctdb commit c32604fd0016de0df14845a2f222edaa3c52a4fa)	2011-11-30 10:00:07 +11:00
Volker Lendecke	5a1da0ac55	Add CTDB_CONTROL_CHECK_SRVID (This used to be ctdb commit ad64ef2c40a2a12b37dbf39142e95c6781c2fc3b)	2011-11-30 09:02:26 +11:00
Ronnie Sahlberg	0420449a6c	Recover Persistent database DB by DB and not record by record Add a new tunable that changes the mode how persistent databases are recovered. RecoveryPDBBySeqNum When set to 1, persistent databases will be recovered in whole from the node which has the highest "__db_sequence_number__" record. This record is managed by samba for those databases where we do persistent writes and have inter-record relations. For these databases we do not want the usual "blend records from all nodes based on individual record RSN" but instead a mode where we pick one instance of the persistent database. If no node was found with a "__db_sequence_number__" record at all, we fail back to the original "recover records independently based on record RSN". Some persistent databases do not contain record interrelations and as such does not contain this special record at all. (This used to be ctdb commit 502150c764298a9fa8c4d8aa445bf7d85d4ee9dc)	2011-11-30 08:48:23 +11:00
Ronnie Sahlberg	3cbff2edd8	LibCTDB: add get persistent db seqnum control (This used to be ctdb commit 6e96a62494bbb2c7b0682ebf0c2115dd2f44f7af)	2011-11-30 08:48:14 +11:00
Michael Adam	31d62794fe	ctdb: add an option --print-recordflags to trigger printing record flags in catdb and dumpdbbackup This changes the default behaviour to not print record flags. (This used to be ctdb commit 2d2ce07c51055d9400b22cd3c1fd682597cb921c)	2011-11-29 13:43:35 +01:00
Michael Adam	e6923904e8	ctdb: add an option --print-hash to enable printing of record hashes when dumping dbs (This used to be ctdb commit efc033c28ade97f9884794256d59a4553e052d5f)	2011-11-29 13:43:34 +01:00
Michael Adam	86cd78efee	ctdb: add an option --print-lmaster to enable printing of lmaster in "ctdb catdb" (This used to be ctdb commit 326f88ef622620cb9e0569c4497bc0e86124beaa)	2011-11-29 13:43:33 +01:00
Michael Adam	dc98c12ac9	ctdb: add an option --print-datasize to only print datasize instead of dumping data in db dumps Used in catdb, cattdb and dumpdbbackup. (This used to be ctdb commit dd866116041e71cbf91e7fd91edcc9501634051d)	2011-11-29 13:43:32 +01:00
Michael Adam	1fcc7651f4	ctdb: add an option --print-emptyrecords to enable printing of empty records in dumping databases this option is used with the commands catdb, cattdb and dumpdbbackup. (This used to be ctdb commit 6ec68a2e667f66d2b194fe48cb75229a2777842e)	2011-11-29 10:30:24 +01:00
Michael Adam	1a31c84348	traverse: add a flag to enable transferring empty records in cluster wide traverse This will be useful for also printing information about empty/deleted records in "ctdb catdb", e.g. for debugging vacuuming issues. (This used to be ctdb commit ddc5da3a0df7701934404192a0a0aa659a806acb)	2011-11-29 10:30:24 +01:00
Martin Schwenke	3ae8273d86	Make some ctdb_takeover.c functions static These were intentionally not static so they could be linked to in unit test programs. However, using the CCAN-style unit tests where relevant code is just included, this is no longer necessary. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d0e9e8554614bd49ffb9ec3509feaa0e80d0f65d)	2011-11-11 14:41:47 +11:00
Martin Schwenke	f186dd90b6	Move some common functions to common/ctdb_ltdb.c Move identical copies of ctdb_null_func(), ctdb_fetch_func(), ctdb_fetch_with_header_func() from ctdb_client.c and ctdb_ltdb_server.c to somewhere common. This is in the context of wanting to run CCAN-style tests where most of the ctdbd code is just included in the test program. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 126cb0d369b2b1aed63801dc4ba0554399e8b7e4)	2011-11-11 14:31:50 +11:00
Martin Schwenke	52ff485958	Added some #ifndefs to stop files being included multiple times. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit fdca12c25e6fce6206135b994dedf44265e4eb09)	2011-11-11 14:31:50 +11:00
Ronnie Sahlberg	44de394796	SRVID ranges: Change the ranges for SRVIDs to allow 8 bit prefixes Update the ranges used for SRVID allocation to allow 8 bit prefixes and thus 56 user-defined bits. Define the defacto-use of the 0x00 prefix as a SRVID used to register a process id Upgrade SAMBA/iSCSI/NFS/TEST from a 32 bit prefix each to an 8 bit prefix each for private use. (This used to be ctdb commit 5de9ec2bdf8067406165bc470becdca87f458ae9)	2011-11-09 08:12:44 +11:00
Ronnie Sahlberg	0e79b2d1e8	Record Fetch Collapse: Collapse multiple fetch request into one single request. When multiple clients fetch the same record concurrently, send only one single fetch across the network and defer all other fetches locally. This improves performance for hot records and reduces cpu load on ctdb. (This used to be ctdb commit 82d6946ad8b3348e8b9d3d971f24925ade02d1be)	2011-11-08 16:08:28 +11:00
Ronnie Sahlberg	c21ec9fffc	ReadOnly: add readonly record lock requests to libctdb Initial readonly record support in libctdb. New records are not yet created by the library but existing records will be delegated as readonly records. This needs a bit more tests before we can drop the "old style" implementation of client code in client/ctdb_client.c (This used to be ctdb commit fb50a45a21ff56480d76acd1c33c13c323cbf5e2)	2011-10-28 11:55:46 +11:00
Ronnie Sahlberg	8e4bfba75c	ReadOnly: Rename the function ctdb_ltdb_fetch_readonly() to ctdb_ltdb_fetch_with_header() since this is what it actually does. (This used to be ctdb commit 94a5ce4e08e7891f07dbfe4c822ca4be5ab10965)	2011-09-13 18:38:20 +10:00
Ronnie Sahlberg	0dc5584101	Merge branch 'master-readonly-records' into foo Conflicts: Makefile.in tools/ctdb.c (This used to be ctdb commit 0fedef0ffba4178126eee9544c5e2db52f5db893)	2011-09-12 09:34:34 +10:00
Ronnie Sahlberg	1c05db2c9c	Merge remote branch 'ddiss/master_pmda_and_client_timeouts' (This used to be ctdb commit 7bebfc7bad8f36e54003b8e25372fdaf54836e21)	2011-09-08 11:22:53 +10:00
David Disseldorp	2f925f1e64	pmda: Attempt reconnects while ctdbd is unavailable Attempt to reconnect to ctdbd on fetch while it is unreachable. We must provide our own queue callback wrapper, as ctdb_client_read_cb() exits on transport failure. (This used to be ctdb commit 28df6fbf1273b8d095a2bc38dca6a6c35c5c31bd)	2011-09-06 14:01:18 +02:00
David Disseldorp	5296da5609	client: add timeout argument to ctdb_attach Rather than using a fixed 2 second CTDB_CONTROL_GETDBPATH timeout. (This used to be ctdb commit 9e178671560cb95121e11d718a76b05380ecd6c5)	2011-09-06 13:57:04 +02:00
David Disseldorp	0628d1c0e6	client: add req timeout argument to ctdb_cmdline_client Following connection to the local ctdbd, ctdb_cmdline_client() currently issues a CTDB_CONTROL_GET_PNN request with a fixed 3 second timeout. The ctdb cmd line client accepts a --timelimit argument for specifying a per request timeout, pass this value through to ctdb_cmdline_client() for use as a CTDB_CONTROL_GET_PNN request timeout. (This used to be ctdb commit 0634d0305f42f17048b6830733767e8dc300e11c)	2011-09-06 13:56:54 +02:00
Ronnie Sahlberg	783ceca07b	Interface monitoring: add a event to trigger every 30 seconds to check that all interfaces referenced by the public address list actually exist. This will make it much easier to root-cause problems such as S1029023 when an external application deleted the interface while it is still in use by ctdbd. (This used to be ctdb commit 9abf9c919a7e6789695490e2c3de56c21b63fa57)	2011-09-06 17:02:19 +10:00
Ronnie Sahlberg	64378fea58	Check interfaces: when reading the public addresses file to create the vnn list check that the actual interface exists, print error and fail startup if the interface does not exist. (This used to be ctdb commit cd33bbe6454b7b0316bdfffbd06c67b29779e873)	2011-09-06 16:11:00 +10:00
Michael Adam	a3e0079568	Add a tunable "AllowClientDBAttach" with default value 1. When set to 0, clients will not be able to attach to databases via the db_attach control. This can be useful for maintenance where ctdb should be kept running but clients should not be able to modify databases. (This used to be ctdb commit ddfeecda87955b4e46777599f678e6926d37f4c4)	2011-09-05 16:17:39 +10:00
Ronnie Sahlberg	206a3c0c66	ReadOnly: add a new control to activate readonly lock capability for a database. let all databases default to not support this until enabled through this control (This used to be ctdb commit 908a07c42e5135a3ba30a625fc4f4e4916de197a)	2011-09-01 11:08:18 +10:00
Ronnie Sahlberg	a0d4d240c3	ReadOnly: add a readonly flag to the getdbmap control and show the readonly setting in ctdb getdbmap output (This used to be ctdb commit 4cac9ad7d9c9ca657a247a6c215476399c7d2210)	2011-09-01 10:28:15 +10:00
Ronnie Sahlberg	63dc96cdb2	ReadOnly: Change the ctdb_db structure to keep a uint8_t for flags instead of a boolean for the persistent flag. This is the same size as the original boolean but allows us to add additional flags for the database (This used to be ctdb commit 7462761638d25880ad46024ad4ef21667eb99a98)	2011-09-01 10:21:55 +10:00
Ronnie Sahlberg	2902203900	Logging: when we log stdout/stderr messages from eventscripts to the system log, prefix every line of output with the name of the eventscript. CQ S1028412 (This used to be ctdb commit 392363c04185f47a826fc6ed95038342be2150bf)	2011-08-26 09:39:25 +10:00
Ronnie Sahlberg	b00b0e9d2e	LibCTDB : add support for getrecmode (This used to be ctdb commit b663f286ea8edd64c0405a1ab45b6ef1da501bf5)	2011-08-23 15:32:14 +10:00
Ronnie Sahlberg	5e72ee5127	LibCTDB : add support for getrecmode (This used to be ctdb commit 0893fa0f3257f50d54896ffa78ec12ee11e8c6d2)	2011-08-23 15:00:27 +10:00
Ronnie Sahlberg	af19b5acff	LibCTDB: add commands where an application can query how many commands are active and we have not yet received a reply to. Applications may use this command to query if it is "safe" to stop the event system and sleep or whether it should first wait for all activity to ctdb daemons to cease first. (This used to be ctdb commit 8d89bfdfd1f55dfeb22890b8bb0f08f31d1fa91a)	2011-08-23 12:43:16 +10:00
Ronnie Sahlberg	37608d70fc	ReadOnly: Add clientside code to fetch readonly records (This used to be ctdb commit 6fccc902bce21fa6ff13ed08ee3341bbf8be39f2)	2011-08-23 10:34:15 +10:00
Ronnie Sahlberg	1bbd4cbf35	ReadOnly: Add a ctdb_ltdb_fetch_readonly() helper function (This used to be ctdb commit 8551420fb331dd2a897f4619278a981fcefb96e8)	2011-08-23 10:33:17 +10:00
Ronnie Sahlberg	17f0e0890c	ReadOnly: Add a new flag to call request packet to indicate that the client wants a readonly delegation (This used to be ctdb commit a3f54a556e97170eedf43708d58dd32446ca5840)	2011-08-23 10:29:40 +10:00
Ronnie Sahlberg	dda2616cf5	ReadOnly: Add a function to start a revoke of all delegations for a record. This triggers a child process to be created to perform the actual potentially blocking calls that are required. (This used to be ctdb commit 7d575ee92c95bc4aab78a33bc1aac7ff0811ab3a)	2011-08-23 10:27:31 +10:00
Ronnie Sahlberg	1bb855bd52	ReadOnly: Add functions to register CALLs to a context used to handle deferral of processing of CALL commands. Once the contexts are freed, the deferred calls are re-issued to the input packet processing functions again. This is needed when/if a CALL can not currently be processed by the main engine due to the record being locked down for revoking of all delegations. The data is passed through several layers of callbacks, and finally a timed event callback to ensure that the processing of the packet will be restarted again at the topmost eventloop, avoiding event loop nesting. (This used to be ctdb commit cc6f78efcfa3b8caeffbd68018e6dfbf81488dce)	2011-08-23 10:25:57 +10:00
Ronnie Sahlberg	3d495c48d2	ReadOnly: Add an extra flag to ctdb_call_local to specify whether we want to write the record and header back to the tdb (for example we do when performing dmaster migrations) (This used to be ctdb commit b935e83255aeb3754b2fd37cf5611e02f7283514)	2011-08-23 10:25:05 +10:00
Ronnie Sahlberg	1441b77cce	ReadOnly: Add "readonly" flag to the ctdb_db_context to indicate if this database supports readonly operations or not. Add a private lock-less tdb file to the ctdb_db_context to use for tracking delegations for records Assume all databases will support readonly mode for now and set the flag for all databases. At later stage we will add support to control on a per database level whether delegations will be supported or not. (This used to be ctdb commit 502f86f79944df4bac9094f716e54110c511dc24)	2011-08-23 10:24:26 +10:00
Ronnie Sahlberg	8f63a5dadd	ReadOnly: Add 4 new record flags to handle read only delegation and revoking of delegations (This used to be ctdb commit 875b0bede217547b51f02648b6a28a3c98b6b949)	2011-08-23 10:17:08 +10:00
Ronnie Sahlberg	e8127f0e0f	ReadOnly: Add clientside functions to send the UPDATE_RECORD control (This used to be ctdb commit 74a5b3d7bafd8827a4ee80095fde5798263821e4)	2011-08-23 10:11:38 +10:00
Ronnie Sahlberg	f924b3f40e	ReadOnly: Add helper functions to manipulate a TDB_DATA as a bitmap for nodes that we are tracking as having a readonly delegation (This used to be ctdb commit d10084e62d37674bb8d9e31d457fd23e050545be)	2011-08-23 10:09:42 +10:00
Ronnie Sahlberg	00a870f759	ReadOnly records: Add a new RPC function FETCH_WITH_HEADER. This function differs from the old FETCH in that this function will also fetch the record header and not just the record data (This used to be ctdb commit c7196d16e8e03bb2a64be164d15a7502300eae0e)	2011-08-23 10:06:59 +10:00
Volker Lendecke	21bb8abc93	libctdb: "ctdb_request_free" does not need the ctdb_connection parameter Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 5a5ed2a43b76bec69494b6cdc6451527f5c472e5)	2011-08-22 17:11:07 +02:00
Martin Schwenke	5ac67504ca	Tests: Initial test code for LCP2 IP allocation algorithm. Move struct ctdb_public_ip_list to ctdb_private.h and put some definitions for some functions from ctdb_takeover.c there. This allows those functions to be called from unit tests. Add ctdb_takeover_tests.c and the Makefile support to build it. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9d34be0233edf3bc022345c0494c4b2a4d7f8480)	2011-07-29 09:01:36 +10:00
Martin Schwenke	ff1a81c872	IP allocation - add LCP2 algorithm. The current non-deterministic IP allocation algorithm balances IPs across the whole cluster. It does not consider different interfaces/VLANs/subnets, so these different groups of IPs aren't generally well balanced. This adds the LCP2 algorithm for IP allocation and allows it to be enabled by setting the "LCP2PublicIPs" tunable to 1. The LCP2 algorithm calculates the imbalance of a node by totalling the squares of the distances between each IP on the node. The IP distance is defined as the length of the longest common prefix (LCP) of bits that is found when comparing 2 IPs. The imbalance of a cluster is the maximum imbalance for any node. At each step the algorithm selects an allocation to the IP/node combination that results in choosing the allocation that best reduces the imbalance of the cluster. The implementation splits out the IP allocation part of ctdb_takeover_run() into new function ctdb_takeover_run_core(), and then extracts out the basic IP assignment code into new functions basic_allocate_unassigned() and basic_failback(). 3 new functions lcp2_init(), lcp2_allocate_unassigned() and lcp2_failback() implement the LCP2 algorithm, and are hooked into ctdb_takeover_run_core(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 61fc7fbd0235469df22deb6581c6bd47e30bc0be)	2011-07-29 09:01:17 +10:00
Michael Adam	827e871ec4	ctdb_private.h: add record flag CTDB_REC_FLAG_AUTOMATIC This is a flag that shall signal that a record has been automatically generated by ctdb and not by an explicit client store operation. This will be used in the ctdb_ltdb_fetch operation which stores an empty record with default initial header before trying to migrate the record from the dmaster when the record does not exist in the local tdb. (This used to be ctdb commit 46381a3cb58ccc11422af8f7798c80ea8d72294f)	2011-03-14 13:35:51 +01:00
Michael Adam	9e8d6b82b5	server: Use the ctdb_ltdb_store_server() in the ctdb daemon for non-persistent dbs This is realized by adding a ctdb_ltdb_store_fn function pointer to the db context and filling it in the attach procedure for non-persistent dbs. (This used to be ctdb commit df49ec44de80affa5ccc637dec12a20a26e8706e)	2011-03-14 13:35:50 +01:00
Michael Adam	a6b13b21c1	client: add accessor function ctdb_header_from_record_handle(). (This used to be ctdb commit cf57efd440ccc3db381386f4749bfcbf8ac5ecae)	2011-03-14 13:35:50 +01:00
Michael Adam	50bd249990	vacuum: add ctdb_local_schedule_for_deletion() (This used to be ctdb commit b70bc141d84f7355d2c6c901961b7366db566980)	2011-03-14 13:35:49 +01:00
Michael Adam	8569fcbc83	server: implement a new control SCHEDULE_FOR_DELETION to fill the delete_queue. (This used to be ctdb commit 680223074e992b32ccf6f42cb80c3fa93074fee7)	2011-03-14 13:35:49 +01:00
Michael Adam	46a05397a4	control: add a new control opcode CTDB_CONTROL_SCHEDULE_FOR_DELETION (This used to be ctdb commit 4cebfa33db3c7effa087f753530c52b2dd8550e6)	2011-03-14 13:35:49 +01:00
Michael Adam	77d4d156d3	control: add macro CHECK_CONTROL_MIN_DATA_SIZE. This is for the control dispatcher to check whether the input data has a required minimum size. (This used to be ctdb commit 2038e745db33cc5c3b4e2db8a00a57ede03906a2)	2011-03-14 13:35:49 +01:00
Michael Adam	9d20f76052	Add a tunable VacuumFastPathCount. This will control how many fast-path vacuuming runs will have to be done, before a full vacuuming will be triggered, i.e. one with a db-traversal. (This used to be ctdb commit 0d997ec7e61a7bee2cb05456f9c7d5e6f7a44797)	2011-03-14 13:35:47 +01:00
Michael Adam	cd061f3dee	Add a delete_queue to the ctdb database context struct. This list will be filled by the client using a new delete control. The list will then be used to implement a fast-path vacuuming that will traverse this list instead of traversing the database. (This used to be ctdb commit 9bbedf786b26bb074f668b31f29a9032af958673)	2011-03-14 13:35:45 +01:00
Michael Adam	f7eeb42219	add a new record flag CTDB_REC_FLAG_VACUUM_MIGRATED. This is to be used internally. The purpose is to flag a record as having been migrated by a VACUUM_MIGRATION, which is triggered by a VACUUM_FETCH message as part of the vacuuming. The local store routine will base its decision whether to delete or to store the record (among other things) upon the value of this flag. This flag should never be stored in the local database copies. (This used to be ctdb commit dd2449c422f323f9b5485e45107a9cc5acc09e08)	2011-03-14 13:35:44 +01:00
Michael Adam	f3fbd31d85	call: Move definition of call flags down to the definition of the flags field. (This used to be ctdb commit 86c844fb08a7fd33e94f56b8d5e43278120e1162)	2011-03-14 13:35:44 +01:00
Michael Adam	a2c11d6edc	call: add new call flag CTDB_CALL_FLAG_VACUUM_MIGRATION This is to be used when the CTDB_SRVID_VACUUM_FETCH message triggers the migration of deleted records to the lmaster. The lmaster can then delete records that have not been migrated with data instead of storing them. (This used to be ctdb commit 455cc6616e10b7f09589f9b87cb60f591bb502b0)	2011-03-14 13:35:44 +01:00
Ronnie Sahlberg	8acb677c9c	Deferred attach : at early startup, defer any db attach calls until we are out of recovery. (This used to be ctdb commit eeaabd579841f60ab2c5b004cbbb1f5de2bfe685)	2011-03-01 12:13:34 +11:00
Michael Adam	2bd04f0ff8	persistent: add ctdb_persistent_finish_trans3_commits(). This function walks all databases and checks for running trans3 commits. It sends replies to all of them (with error code) and ends them. To be called when a recovery finishes. (This used to be ctdb commit 70ba153b532528bdccea70c5ea28972257f384c1)	2011-02-24 10:35:26 +01:00
Michael Adam	ace1efb878	persistent: add a ctdb_persistent_state member to the ctdb_db context. To be used for tracking running transaction commits through recoveries. (This used to be ctdb commit 1237e15df4af58a3d220eea42a4b75e21e65029f)	2011-02-24 10:35:25 +01:00
Ronnie Sahlberg	65f44e159f	Add two new flags for the ltdb header. One of which signals that the record has never been migrated to/from a node while containing data. This property "has never been migrated while non-zero" is important later to provide heuristics on which records we might be able to purge from the tdb files cheaply, i.e. without having to rely on the full-blown database vacuum. These records are believed to be very common and the pattern would look like this : 1, no record exists at all. 2, client opens a file 3, samba requests the record for this file 4, an empty record is created on the LMASTER 5, the empty record is migrated to the DMASTER 6, samba writes a <sharemode> to the record locally and the record grows 7, client finishes working the file and closes the file 8, samba removes the sharemode and the record becomes empty again. 9, much later : vacuuming will delete the record At stage 8, since the record has never been migrated onto a node while being non-zero it would be safe, and much more efficient to just delete the record completely from the database and hand it back to the LMASTER. The flags occupy the same uint32_t as was previously used for laccessor/lacount in the header. For now, make sure the flags only define/use the top 16 bits of this field so that we are sure we don't collide with bits set to one from previous generations of the ctdb cluster database prior to this change in semantics of this word. This is a rework of Michael's patch : commit 2af1a47cbe1a608496c8caf3eb0c990eb7259a0d Author: Michael Adam <obnox@samba.org> Date: Tue Nov 30 17:00:54 2010 +0100 add a DEFAULT record flag and a MIGRATED_WITH_DATA record flag. (This used to be ctdb commit e075670dee8e6ecaba54986f87a85be3d0528b6b)	2011-02-18 10:14:56 +11:00
Ronnie Sahlberg	b57bd0f896	Remove LACOUNT and LACCESSOR and migrate the records immediately. This concept didn't work out and it is really just as expensive as a full migration anyway, without the benefit of caching the data for subsequent accesses. Now, migrate the records immediately on first access. This will be combined with a "cheap vacuum-lite" for special empty records to prevent growth of databases. Later extensions to mimic read-only behaviour of records will include proper shared read-only locking of database records, making the laccessor/lacount read-only access to the data obsolete anyway. Removing this special case and the handling of lacount/laccessor makes the codepath where shared read-only locking will be implemented simpler, and frees up space in the ctdb_ltdb header for use by vacuuming flags as well as read-only locking flags. (This used to be ctdb commit 155dd1f4885fe142c6f8bd09430f65daf8a17e51)	2011-02-18 10:08:32 +11:00
Ronnie Sahlberg	0f33605866	LockWait congestion. Add a dlist to track all active lockwait child processes. Every time we create a new lockwait handle, check if there is already an active lockwait process for this database/key and if so, send the new request straight to the overflow queue. This means we will only have one active lockwait child process for a certain key, even if there were thousands of fetch-lock requests for this key. When the lockwait processing finishes for the original request, the processing in d_overflow() will automagically process all remaining keys as well. Add back a --nosetsched argument to make it easier to run under gdb (This used to be ctdb commit 3e9317a2e1f687b04bf51575d47fcd4faa6e6515)	2011-01-24 12:21:58 +11:00
Rusty Russell	e57362ecf4	ctdb_lockwait: create overflow queue. Once we have more than 200 children waiting on a particular db, don't create any more. Just put them on an overflow queue, and when a child gets a lock search that queue to see if others were after the same lock (they probably were). (This used to be ctdb commit 5e614e8cfd1e9a4b13035a0e400b7a60a745b510)	2011-01-24 12:21:50 +11:00
Ronnie Sahlberg	fcd98a7e59	LIBCTDB: add support for traverse (This used to be ctdb commit 9463e04038ba36792583f83bd95c1af322dc283a)	2011-01-14 17:38:56 +11:00
Ronnie Sahlberg	c4006ce844	Add ctdb_fork() which will fork a child process and drop the real-time scheduler for the child. Use ctdb_fork() from callers where we don't want the child to be running at real-time privilege. (This used to be ctdb commit 58795a4c9e0624e20fa3e0023b65127053edd103)	2011-01-11 07:40:41 +11:00
Ronnie Sahlberg	ea0df6d882	Revert scheduling back to use real-time processes Revert this patch: commit 482c302d46e2162d0cf552f8456bc49573ae729d We may need to use real-time processes for the main daemon and the recovery daemon to handle the cases where systems come under very high loads. (This used to be ctdb commit 08bef9dcab6e4da15fc783f8624e5ed09aa060b5)	2011-01-11 07:40:35 +11:00
Ronnie Sahlberg	c69ada0090	add a new ctdb_ltdb function to delete a record in a normal database (This used to be ctdb commit fe9070ec9be69e6a6fcbf9899e7ced24541c9c3a)	2010-12-07 15:32:30 +11:00
Ronnie Sahlberg	83e68b62dd	delay loading the public ip address file until after we have started the transport and discovered our own pnn number (This used to be ctdb commit 1b57fc866fc836b5dbd3ef7b646e5a0f4280e81e)	2010-11-10 14:55:24 +11:00
Ronnie Sahlberg	5f76f3c0e2	Add a new tunable : DisableIPFailover that when set to non 0 will stop any ip reallocations at all from happening. (This used to be ctdb commit d8d37493478a26c5f1809a5f3df89ffd6e149281)	2010-11-10 14:55:24 +11:00
Ronnie Sahlberg	5ef29f9f25	Update latency counters to show min/max and average (This used to be ctdb commit 1919e949af4641ffe919123e44b02fb87c13ab9f)	2010-10-11 15:12:24 +11:00
Ronnie Sahlberg	3ba7ac13eb	Create a tunable for how often to collect rolling statistics and initialize it to 1 second (This used to be ctdb commit cb8c779bb5d9862abbe08919aa181a1a1b2bef18)	2010-09-30 15:00:57 +10:00
Ronnie Sahlberg	9f66a93f12	Add rolling statistics that are collected across 10 second intervals. Add a new command "ctdb stats [num]" that prints the [num] most recent statistics intervals collected. (This used to be ctdb commit e6e16fcd5a45ebd3739a8160c8fb5f44494edb9e)	2010-09-29 12:14:45 +10:00
Ronnie Sahlberg	41b6e09fb1	Add a new statistics structure to keep the current running statistics (This used to be ctdb commit 09e5a2fb47c312f71f455cdbf8d9cabcca1041a4)	2010-09-29 12:14:35 +10:00
Ronnie Sahlberg	39c367a68f	Create macros to update the statistics counters and use these macros everywhere instead of manipulating the counters directly. (This used to be ctdb commit 2e648df890e5713bc575965d87937827b068d0d7)	2010-09-29 12:14:24 +10:00
Ronnie Sahlberg	c6e20a06c7	set up a handler to catch and log debug messages from the tevent layer (This used to be ctdb commit fdb4c02f595fa207310a9a48da3fefd653fa9e4b)	2010-09-28 08:30:26 +10:00
Ronnie Sahlberg	22ea35f17d	add a GETPUBLICIPS control to libctdb and use this in the test example. Enhance the test example to show the new releaseip/takeip messages (This used to be ctdb commit 21cc57883e6c02b0e037211b26d1d866d5d7f03d)	2010-09-15 14:58:11 +10:00
Stefan Metzmacher	0b5bd411ca	server/banning: also release all ips if we're banning ourself metze (This used to be ctdb commit c386f2c62f06f1c60047b7d4b1ec7a9eec11873c)	2010-09-14 15:50:31 +10:00
Ronnie Sahlberg	d8d8b9e1d7	add a new serverid to send a message every time an ip address is taken on the local node (This used to be ctdb commit 1261f3d9702800a4e59550c881350daf479f00ef)	2010-09-13 15:43:19 +10:00
Ronnie Sahlberg	991a6ae2a0	Update the comment for the range reserved for SAMBA and define a new symbol to represent this range similarly to NFSD and ISCSID Keep the old symbol name to be backward compatible with software using these headers. (This used to be ctdb commit 2ce34e50d057ba95249117a581658a5ad7e8eb60)	2010-09-13 15:10:36 +10:00
Ronnie Sahlberg	09a08b0da3	define and reserve a range of ctdb message ports for use by nfs and iscsi servers (This used to be ctdb commit 84a44ac8ee74dd7af15e378c6cafbedb95feec60)	2010-09-13 15:10:24 +10:00
Ronnie Sahlberg	65382a59d1	Add two new server types to the server_id structure. NFSD and ISCSID for now. (This used to be ctdb commit 4cd4bab68f0ba0305a585a2aabcb6871cdb11d96)	2010-09-13 15:10:12 +10:00
Ronnie Sahlberg	a2c874bd61	Implement a new function GETNODEMAP in libctdb. This function returns a pointer to a nodemap structure. The returned structure must later be freed by calling ctdb_free_nodemap(). Move the definition of ctdb_sock_addr from ctdb_client.h to ctdb_protocol.h Move the definition of the node flags, ctdb_node_and_flags and ctdb_node_map from ctdb_private.h to ctdb_protocol.h Add both sync and async example for ctdb_getnodemap to the test application libctdb/tst.c (This used to be ctdb commit 31c10eb2b337fd7d8a97a1f9e69b0e7570fec71d)	2010-09-13 14:32:11 +10:00
Ronnie Sahlberg	c95f4258d8	Add a new event "ipreallocated" This is called every time a reallocation is performed. While STARTRECOVERY/RECOVERED events are only called when we do ipreallocation as part of a full database/cluster recovery, this new event can be used to trigger on when we just do a light failover due to a node becoming unhealthy. I.e. situations where we do a failover but we do not perform a full cluster recovery. Use this to trigger for natgw so we select a new natgw master node when failover happens and not just when cluster rebuilds happen. (This used to be ctdb commit 7f4c591388adae20e98984001385cba26598ec67)	2010-08-30 18:09:30 +10:00
Ronnie Sahlberg	2e8aac6689	Merge commit 'rusty/ports-from-1.0.112' into foo (This used to be ctdb commit 13e58d92f5f1723e850a82ae030d0ca57e89b1ee)	2010-08-19 13:17:56 +10:00
Ronnie Sahlberg	4c05f1900c	Merge commit 'rusty/vacuum-fix-master' (This used to be ctdb commit dc301b324d2c14a2425a965c076113c4fe97903e)	2010-08-19 13:16:35 +10:00
Ronnie Sahlberg	5aa5f3e7bf	Remove the structure ctdb_control_tcp_vnn since this is identical to the structure ctdb_tcp_connection. Add a new "ctdb deltickle" command to delete tickles from the database. This can ONLY be used for tickles created by "ctdb addtickle". Push any "addtickle/deltickle" updates to other nodes every TickleUpdateInterval seconds. (This used to be ctdb commit acded034e2f0dcae4c2c9e54e16a001caf23caec)	2010-08-18 12:36:03 +10:00
Rusty Russell	9fbb191b78	logging: give a unique logging name to each forked child. This means we can distinguish which child is logging, esp. via syslog where we have no pid. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 68b3761a0874429b90731741f0531f76dcfbb081)	2010-08-18 11:46:32 +09:30
Rusty Russell	af55c910a4	freeze: abort vacuuming when we're going to freeze. There are some reports of freeze timeouts, and it looks like vacuuming might be the culprit. So we add code to tell them to abort when a freeze is going on. (This is based on the 1.0.112 branch version 517f05e42f, but far simpler since tdb is now robust against processes being killed during transaction commit) CQ:S1018154 & S1018349 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit f5d7dc679501e607c2c83a248a89d3cada9df146)	2010-08-18 10:54:28 +09:30
Ronnie Sahlberg	ddf3c621c1	Merge commit 'rusty/libctdb-new' into foo (This used to be ctdb commit 1566d2d23ab698896b3b6a76974a5c7452db4a62)	2010-08-18 09:53:52 +10:00
Rusty Russell	f93440c4b7	event: Update events to latest Samba version 0.9.8 In Samba this is now called "tevent", and while we use the backwards compatibility wrappers they don't offer EVENT_FD_AUTOCLOSE: that is now a separate tevent_fd_set_auto_close() function. This is based on Samba version `7f29f817fa`. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 85e5e760cc91eb3157d3a88996ce474491646726)	2010-08-18 09:16:31 +09:30
Rusty Russell	a65cb6a9ae	libctdb: add synchronous message handling and unregister, with tests. It turns out that we do want a separate private arg for the message handler and the completion callback, so we change that. We also fix the prototypes of the remove_message functions as we implement them. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 332375246eccd95da626f434f6d49dd9458a9787)	2010-08-09 15:41:32 +09:30
Ronnie Sahlberg	c5de7cfb8c	Merge commit 'rusty/master' (This used to be ctdb commit b4391c00476cde74101736986dfcd2be6c959edc)	2010-07-30 16:25:40 +10:00
Evan Kinney	0557c418e3	ctdb: Fixed use of reserved word "private" in typedefs In include/ctdb.h, ctdb_callback_t and ctdb_rrl_callback_t were defined with a void private variable. The variable name was changed to void private_data to avoid issues encountered in the Samba autoconf script. Evan Kinney <evan.kinney@sas.com> (This used to be ctdb commit 1f453aa4b5e749468c7788afac09c6f0900ea18f)	2010-07-29 17:16:36 +10:00
Rusty Russell	7061ceffd8	Report client for queue errors. We've been seeing "Invalid packet of length 0" errors, but we don't know what is sending them. Add a name for each queue, and print nread. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit e6cf0e8f14f4263fbd8b995418909199924827e9)	2010-07-01 23:08:49 +10:00
Rusty Russell	8946028a07	speed startup: add --sloppy-start. The extra recovery interval wait was introduced in 821333afb458 but no explanation was provided in that message. Nonetheless, if starting the entire cluster for the first time, it should be safe to skip this. We use the commandline arg --sloppy-start which should discourage people from using it outside testing. Seconds between ctdbd first log message and node healthy: BEFORE: 16.10 AFTER: 4.03 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 509e2e89ae233a0e91998d95267bf62f296a73cd)	2010-06-22 22:52:34 +09:30
Rusty Russell	cfe0edc0b9	libctdb: implement synchronous readrecordlock interface. Because this doesn't use a generic callback, it's not quite as trivial as the other sync wrappers. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 1f20b938d46d4fcd50d2b473c1ab8dc31d178d2d)	2010-06-21 14:47:34 +09:30
Rusty Russell	b93e65eaf7	libctdb: implement ctdb_disconnect and ctdb_detachdb These are important for testing, since we can easily tell if we leak memory if there are outstanding allocations after calling these. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 18a212aa40d0ff9ff59775c6fcf9dc973e991460)	2010-06-18 15:35:52 +09:30
Rusty Russell	5f9e4b60ae	Delay reusing ids to make protocol more robust Ronnie and I tracked down a bug which seems to be caused by a node running so slowly that we timed out the request and reused the request id before it responded. The result was that we unlocked the wrong record, leading to the following: ctdbd: tdb_unlock: count is 0 ctdbd: tdb_chainunlock failed smbd[1630912]: [2010/06/08 15:32:28.251716, 0] lib/util_sock.c:1491(get_peer_addr_internal) ctdbd: Could not find idr:43 ctdbd: server/ctdb_call.c:492 reqid 43 not found This exact problem is now detected, but in general we want to delay id reuse as long as possible to make our system more robust. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 9eb9c53ef29f4871ae2fe62fc5cb6145fca89eed)	2010-06-10 08:58:55 +09:30
Rusty Russell	7589b58138	libctdb: more bool conversion, and accompany lock by ctdb_db in API I missed some int->bool conversions previously, particularly the return of ctdb_writerecord(). By always handing functions ctdb_connection or ctdb_db, we keep it consistent with the rest of the API and can do extra lock consistency checks. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 3f939956ddd693cba6ea5c655288f4f5ca95f768)	2010-06-08 17:11:40 +09:30
Rusty Russell	866cca9637	libctdb: clarify logging levels Now we have more messages, it seems to make sense to document their usage and make them consistent. In particular, LOG_CRIT for internal libctdb problems, LOG_ALERT for API misuse. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit a6fed3f577c7ec51df38ed15ecb9db6ea2ae7c8f)	2010-06-08 16:53:17 +09:30
Ronnie Sahlberg	b9e5c8a47b	Split ctdb_release_lock() into a function to release the lock and another function to free the data structures. This allows us to keep the data structure valid after the lock has been released by the application and we can trap and warn when the application is accessing the lock after it has been released. I.e. application bugs. (This used to be ctdb commit 463a266205f145cd9c4c36b9c59d3747eeef0e2e)	2010-06-05 15:38:11 +10:00
Rusty Russell	3510980049	libctdb: documentation Full documentation for all the functions. This looks longer than it is, because it sorts them into async and sync parts, and also renames some formal parameters. Added TODO to libctdb directory to track our plans. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 108e9c2450876a9f8821aa7efd5be971eee5afd3)	2010-06-04 20:30:08 +09:30
Rusty Russell	c5b4768816	libctdb: use values from ctdb_protocol.h, don't re-declare We're best off including ctdb_protocol.h to get these, even if we document the important ones in ctdb.h. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit cdc19dc73032470d57f38bf825d8113b3a0c8cd1)	2010-06-04 20:22:03 +09:30
Rusty Russell	3a569c14bc	libctdb: use bool in API Return bool instead of -1/0; that's what the young kids are doing these days! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit e285b5d5a9d4fbc4f75dbb237d2fcdbd84f2d605)	2010-06-04 20:19:25 +09:30
Rusty Russell	379fd4e606	libctdb: add logging infrastructure This is based on Ronnie's work, merged with mine. That means errors are all my fault. Differences from Ronnie's: 1) use syslog's LOG_ levels directly. 2) typesafe arg to log function, and use it (eg stderr) in helper function. 3) store fn in ctdb context, and expose ctdb_log_level directly thru API. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 86259aa395555aaf7b2fae7326caa2ea62961092)	2010-06-04 20:27:03 +09:30
Rusty Russell	cc8435852c	libctdb: add ctdb arg to more functions. This is going to help for logging, since we want it there. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 0786152472bc43efae4c896f7c6c07c6e080b9b2)	2010-06-04 16:54:08 +09:30
Rusty Russell	94df6f322d	libctdb: change callback for ctdb_readrecordlock. After discussion with Ronnie, we decided to revisit this interface. We use the name ctdb_readrecordlock_async, as it is not always a send, and we use a specific callback to avoid the "fake request" creation on the fast path. The request itself is never exposed: this means it can't be cancelled, but we can revisit that later if need be. This makes both use and implementation simpler. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 03b5546ae45a60ab41eb4f7159a45bfdbf959888)	2010-06-04 13:33:08 +09:30
Ronnie Sahlberg	2d4b98381f	ctdb_req_control contains 4 padding bytes. Create an explicit pad variable here and set it to 0 when creating a control to keep valgrind happy. PDUs are padded to 8 byte boundary. If padding is used, memset it to 0 to keep valgrind happy. (This used to be ctdb commit 8818d5c483558c0faa6a3923ed5e675fdcfc13af)	2010-06-02 16:49:05 +10:00
Ronnie Sahlberg	f94291c37d	Make the call to free the request explicit in the callback instead of implicit (This used to be ctdb commit 573e4e2d2bd09dd9579150cce926de774a0b609c)	2010-06-02 13:49:34 +10:00
Ronnie Sahlberg	53ea238c6c	Add a variable for start/current time to ctdb statistics and print the time the statistics were taken and for how long the statistics have been collected to the "ctdb statistics" output. (This used to be ctdb commit 1bdfe0cd3370a335b960ce1ef97eade93b0cd2fa)	2010-06-02 13:14:53 +10:00
Ronnie Sahlberg	8666094e92	add a function to read the current socketname from the ctdb structure (This used to be ctdb commit 112d252b2ab614eeac38e4a1658cd1e85f6eb829)	2010-06-02 10:25:31 +10:00
Ronnie Sahlberg	3c7350b8c6	rename ctdb_remove_message_handler to ctdb_client_remove_message_handler to avoid conflict with the function of the same name in libctdb (This used to be ctdb commit 636ed76d04c8c499a911eb0d72d54b71b0a73d31)	2010-06-02 10:05:58 +10:00
Ronnie Sahlberg	f1b8bd94bb	rename ctdb_message_fn_t to ctdb_msg_fn_t to avoid a conflict with the type of the same name used in libctdb (This used to be ctdb commit 49e23f8329649e4d9eefab47c9b158fcc7210d07)	2010-06-02 10:00:58 +10:00
Ronnie Sahlberg	bc208bc916	rename ctdb_set_message_handler to ctdb_client_set_message_handler to avoid a collision with the function of the same name in libctdb (This used to be ctdb commit 41dbdd4fc0ab560420fb0e24a3179ff7c94c5bb7)	2010-06-02 09:51:47 +10:00
Ronnie Sahlberg	761a075de9	rename ctdb_send_message to ctdb_client_send_message to resolve a collision with the function of the same name in libctdb (This used to be ctdb commit ac3292c12832484a22715f1d46aa23f3b7c8a6f6)	2010-06-02 09:45:21 +10:00
Ronnie Sahlberg	bdbf7077e8	rename ccan/typesafe_cb.h to ctdb_typesafe_cb.h and add this file to the install/rpm (This used to be ctdb commit 96f186240a17386de1e02eb3af392d97bb55a1ae)	2010-06-02 09:18:48 +10:00
Rusty Russell	bd8d302589	libctdb: tweak interface for readrecordlock Previously we could hang in poll with the callback pending (since we fake it): explicitly call it immediately. Note: I experienced corruption using DLIST_ADD_END (ctdb->pnn was blatted when adding to the message_handler list). I switched them all to DLIST_ADD, but maybe I'm using it wrong? (This used to be ctdb commit 3727165f0d206999d2cfc2800ff8868640868c7c)	2010-05-24 13:52:17 +09:30
Rusty Russell	30f4d01df1	libctdb: uniform callbacks, _recv functions to pull out data. This is a bit tricky for those cases where we need to do multiple or zero I/Os (eg. attachdb and readrecordlock), but works well for the simple cases. (This used to be ctdb commit ebe4dd724338c156423cfdcc10a75b68c2084cde)	2010-05-24 13:17:36 +09:30
Rusty Russell	7046a1ad0a	libctdb: API changes from Ronnie's version These simplifications mostly came up due to the implementation. o Rename ctdb_context to ctdb_connection. We already have a ctdb_context internally in ctdbd; don't confuse them! o Rename ctdb_handle to struct ctdb_request. From the user POV it's a request, and it's also useful internally to avoid implicit cast to/from void *. o Rename ctdb_db_context to ctdb_db. o Introduce ctdb_lock. This provides an explicit "lock object" you get from readrecordlock and have to hand to those functions which need you to hold a lock. o status args are "int" not int32_t. Should this be a bool? o Remove last traces on generic callback. Without semi-sync API, this doesn't help anything and loses type safety. o Remove the semi-async API. We can add this later, but I think a sync and async API is enough for our poor users for the moment :) o Registering a message handler also takes a callback. This way you can tell if it failed. Not sure if this is overkill, but it's consistent. o ctdb_service() takes an revents arg Strictly not necessary for a nonblocking fd, but nice to know if a read or write is possible. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 86e1f93df856f9627182ed0e18bfcff6866c0954)	2010-05-20 16:07:30 +09:30
Rusty Russell	bbfb992f55	libctdb: ctdb.h and tst.c from Ronnie This imports ctdb.h and tst.c from Ronnie's work: it's a separate commit for now to make the changes obvious. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 09f05cbfc883e5aac33d3781b163cde178ece4cf)	2010-05-20 16:01:28 +09:30
Rusty Russell	d5f6026a22	libctdb: reorganize headers: remove ctdb.h, add ctdb_client.h and ctdb_protocol.h ctdb_client.h is the existing internal client interface (which was mainly in ctdb.h), and ctdb_protocol.h is the information needed for the wire protocol only. ctdb.h will be the new, shiny, libctdb API. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 4bba6b8cd47b352f98d41f9f06258d5ac3c9adef)	2010-05-20 15:18:30 +09:30
Ronnie Sahlberg	6f1221e9e1	Add the number of performed recoveries to the "ctdb statistics" output. (This used to be ctdb commit fa045733cb81412f0d02ab52d74eabc7efca8b3d)	2010-05-11 09:44:53 +10:00
Rusty Russell	72c275dd70	ctdb: use full range of IDR This resolves a problem with huge numbers of requests which could overflow 16 bits. Fortunately, the IDR should scale reasonably well, so we can simply hold all the requests. Although no one checks for failure, I added a constant for that. BZ: 60540 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 72efc4122e37798227c3420a65ed1f706ca9ebe7)	2010-05-11 09:44:43 +10:00
Ronnie sahlberg	46f00a2478	Merge commit 'rusty/signal-fix' (This used to be ctdb commit 221a9bb41c3a7af0cc65cda78365010893ca1430)	2010-05-03 15:57:41 +10:00
Ronnie Sahlberg	4a43428440	The recent change to the recovery daemon to keep track of and verify that all nodes agree on the most recent ip address assignments broke "ctdb moveip ..." since that call would never trigger a full takeover run and thus would immediately trigger an inconsistency. Add a new message to the recovery daemon where we can tell the recovery daemon to update its assignments. BZ62782 (This used to be ctdb commit e7069082e5f0380dcddee247db8754218ce18cab)	2010-05-03 15:47:17 +10:00
Rusty Russell	e1b59b6a47	eventscript: don't do debugging system() from inside signal handler In the case of a timeout, we dump a log of what's happening to a file in /tmp. We do it from the signal handler, which is an unreliable hack (BZ58365). Instead, create another (lower-priority) child to do the dump, then kill the timedout script. Note that this doesn't quite work as intended (the dump is often run after the script has been killed), so the next patch resolves this. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 7ee5ecc8d53e78e2dec21197b74a74cc4ae1834c)	2010-04-08 15:13:29 +09:30
Ronnie Sahlberg	06885ea9a7	In the recovery daemon, keep track of which node we have assigned public ip addresses and verify that the remote nodes have/keep a consistent view of assigned addresses. If a remote node has an inconsistent view of addresses vis-a-vis the recovery master this will trigger a full ip reallocation. (This used to be ctdb commit f3bf2ab61f8dbbc806ec23a68a87aaedd458e712)	2010-04-08 14:25:26 +10:00
Stefan Metzmacher	3419e9c4dd	server: add "setup" event This is needed because the "init" event can't use 'ctdb' commands. metze (This used to be ctdb commit 1493436b6b24eb05a23b7a339071ad85f70de8f4)	2010-02-23 10:38:49 +01:00
Stefan Metzmacher	98ee69c66d	server: add updateip event metze (This used to be ctdb commit 712ed0c4c0bff1be9e96a54b62512787a4aa6259)	2010-01-20 11:11:01 +01:00
Stefan Metzmacher	32d00d0a0d	controls: add stubs for GET_PUBLIC_IP_INFO, GET_IFACES and SET_IFACE_LINK_STATE metze (This used to be ctdb commit a2c9e4578e149eccb2c6183f64a6b657eb95c5e1)	2010-01-20 11:10:59 +01:00
Stefan Metzmacher	37880b0d0a	server: use CTDB_PUBLIC_IP_FLAGS_ONLY_AVAILABLE during a takeover run We now ask for the known and available interfaces. This means a node gets a RELEASE_IP event for all interfaces it "knows", but doesn't serve and a node only gets a TAKE_IP event for "available" interfaces. metze (This used to be ctdb commit a695a38e49e7c3e15a9706392dc920eeab1f11ba)	2010-01-20 11:10:59 +01:00
Stefan Metzmacher	b9f6afe4b0	client: add CTDB_PUBLIC_IP_FLAGS_ONLY_AVAILABLE ctdb_ctrl_get_public_ips_flags() metze (This used to be ctdb commit 6bd780510058e5589f2f7c3722d37acbba4935ab)	2010-01-20 11:10:58 +01:00
Stefan Metzmacher	15616d3271	reserve upper bits in ctdb_control->flags for opcode specific flags metze (This used to be ctdb commit 91122c322fbec08138b92c528d9a946f6727b4fd)	2010-01-20 11:10:58 +01:00
Stefan Metzmacher	bea53c60b8	server: keep the interface information in a list of ctdb_iface structures metze (This used to be ctdb commit ff5291778f0752e176539397e9530dcf0e546bea)	2010-01-20 11:10:58 +01:00
Stefan Metzmacher	a1da4e05b5	server: allow multiple interfaces comma separated in public_addresses metze (This used to be ctdb commit 33a00ef7233051acdbc66410130ec5d876a8422f)	2010-01-20 11:10:58 +01:00
Stefan Metzmacher	bec35e6441	server: add a ctdb_set_single_public_ip() helper function metze (This used to be ctdb commit 400b4806c4a9686a2ee6398b5d7c3e0ca0793fd1)	2010-01-20 11:10:57 +01:00
Stefan Metzmacher	fd06167caa	server: add "init" event This is needed because the "startup" event runs after the initial recovery, but we need to do some actions before the initial recovery. metze (This used to be ctdb commit e953808449c102258abb6cba6f4abf486dda3b82)	2010-01-20 09:44:36 +01:00
Stefan Metzmacher	9cba540514	lib/util: import fault/backtrace handling from samba. metze (This used to be ctdb commit 8171d66f0061fe23ed6dfef87ffe63bfc19596eb)	2010-01-20 09:44:36 +01:00
Stefan Metzmacher	a309287947	move DEBUG* macros to one place metze (This used to be ctdb commit 4b4dd5d7f81bf226e05c7f3d40087043da1517a2)	2010-01-20 09:44:36 +01:00
Ronnie Sahlberg	a1d60b1511	Make the size of the in memory ringbuffer for keeping the recent log messages configurable using --log-ringbuf-size=<num-entries>. Add an entry in the sysconfig file to set this persistently. (This used to be ctdb commit c79c2da69bc352f509e7fca4b9172a4b7f23c0f8)	2010-01-15 15:38:56 +11:00
Ronnie Sahlberg	4c722fe34c	fix a conflict in the merge from rusty Merge commit 'rusty/ctdb-no-setsched' Conflicts: server/ctdb_vacuum.c (This used to be ctdb commit b4365045797f520a7914afdb69ebd1a8dacfa0d9)	2009-12-17 08:18:04 +11:00
Rusty Russell	af2613e16f	ctdb: use mlockall, cautiously We don't want ctdb stalling due to paging; this can be far worse than scheduling delays. But if we simply do mlockall(MCL_FUTURE), it increases the risk that mmap (ie. tdb open) or malloc will fail, causing us to abort. This patch is a compromise: we mlock all current pages (including 10k of future stack for expansion) and then relock when a client asks us to open a TDB. We warn, but don't exit, if it fails. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 82f778e85440bc713d3f87c08ddc955d3cfce926)	2009-12-16 20:57:20 +10:30
Rusty Russell	c488ba440a	Remove RT priority, use niceness. 1) It's buggy. Code needs to be carefully written (ie. no busy loops) to handle running with it, and we fork and run scripts.[1] 2) It makes debugging harder. If ctdbd loops (as has happened recently) it can be extremely hard to get in and see what's happening. We've already seen the valgrind hacks. 3) We have seen recent scheduler problems. Perhaps they are unrelated, but removing this very unusual setup is unlikely to hurt. 4) It doesn't make anything faster. Under all but the most perverse of circumstances, 99% of the cpu gives the same performance as 100%, and we will always preempt normal processes anyway. [1] I made this worse in 0fafdcb8d353 "eventscript: fork() a child for each script" by removing the switch_from_server_to_client() which restored it, but even that was only for monitor scripts. Others were run with RT priority. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 482c302d46e2162d0cf552f8456bc49573ae729d)	2009-12-16 19:26:22 +10:30
Rusty Russell	f148735928	Add --valgrinding flag instead of --nosetsched The do_setsched was being tested for whether to mmap tdbs: let's make it explicit. We can also happily move the kill-child eventscript hack under this flag. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 2ee86cc1f311d7b7504c7b14d142b9c4f6f4b469)	2009-12-16 20:59:15 +10:30
Stefan Metzmacher	aa658b6777	client: make ctdb_dumpdb_record() public metze (This used to be ctdb commit 1cdc8dbb9cb971cf6dd6cd22b1adaf70ddc77e65)	2009-12-16 08:08:32 +01:00
Stefan Metzmacher	0e436b46c6	client: add ctdb_ctrl_getdbhealth() metze (This used to be ctdb commit 5abe44d0113839d3a45c9a31d30856aa70c2ea1f)	2009-12-16 08:08:32 +01:00
Stefan Metzmacher	f1f0af2b67	server: add CTDB_CONTROL_DB_SET_HEALTHY and CTDB_CONTROL_DB_GET_HEALTH metze (This used to be ctdb commit 7332d900538f0cbcd953a723417a0fe31dc9807c)	2009-12-16 08:08:29 +01:00
Stefan Metzmacher	94bc40307a	server: Use tdb_check to verify persistent tdbs on startup Depending on --max-persistent-check-errors we allow ctdb to start with unhealthy persistent databases. The default is 0 which means to reject a startup with unhealthy dbs. The health of the persistent databases is checked after each recovery. Node monitoring and the "startup" is deferred until all persistent databases are healthy. Databases can become healthy automatically by a completely HEALTHY node joining the cluster. Or by an administrator with "ctdb backupdb/restoredb" or "ctdb wipedb". metze (This used to be ctdb commit 15f133d5150ed1badb4fef7d644f10cd08a25cb5)	2009-12-16 08:06:10 +01:00
Stefan Metzmacher	b74918b465	server: open /var/ctdb/state/persistent_health.tdb.X on startup This node internal tdb will store the HEALTH state of persistent tdbs. metze (This used to be ctdb commit cbda4666be88c11a810a192a70667b57f773ace1)	2009-12-16 08:03:56 +01:00
Stefan Metzmacher	9a96ae0c97	server: only do the mkdir() calls for db_directory* once at the start metze (This used to be ctdb commit f30f33685db50860b6cd6fd1b6bdc3066620a78f)	2009-12-16 08:03:56 +01:00
Stefan Metzmacher	b48228e7f9	server: add db_directory_state to ctdb_context metze (This used to be ctdb commit 656a6ec5ed81ccfbb86144156a3158e48f105ee4)	2009-12-16 08:03:55 +01:00
Ronnie Sahlberg	640c48c844	Revert "cleanup: remove a tunable we no longer use in the eventscripts any more :" This reverts commit 401f421fa003d9515df15e759b50b56e0c67d69c. Conflicts: include/ctdb_private.h server/ctdb_tunables.c (This used to be ctdb commit b883d19a495a41a22db37f9c2cf6250fee529de0)	2009-12-16 09:51:17 +11:00
Ronnie Sahlberg	0982299bed	Revert "Make fetch_locked more scalable" This reverts commit 5736e17c139c9a8049e235429aeae0c6c9d0e93d. (This used to be ctdb commit 3d2d877d877146ca09a28a3a44f4840eb36fd377)	2009-12-15 14:26:28 +11:00
Ronnie Sahlberg	5a7e9900df	Merge commit 'obnox/ctdb-wip-trans3' into trans3 (This used to be ctdb commit ac06a0e042e7d024060d6e87a49bda9ccc072c52)	2009-12-15 14:25:55 +11:00
Ronnie Sahlberg	649ba2631d	Rename the tunable EventScriptBanCount to EventScriptTimeoutCount since we no longer ban nodes when dodgy scripts continue to hang. We now only mark nodes as unhealthy if monitor events fail or timeout. Never ban. (This used to be ctdb commit 5c8e56fc7a518e115bceac257867739283cf6a1e)	2009-12-14 15:53:23 +11:00
Ronnie Sahlberg	ed6b5a8c68	cleanup: remove a tunable we no longer use in the eventscripts any more : EventScriptUnhealthyOnTimeout (This used to be ctdb commit 401f421fa003d9515df15e759b50b56e0c67d69c)	2009-12-14 15:48:47 +11:00
Ronnie Sahlberg	e76561f544	remove the variable "disable when unhealthy" there is no rational need for a setting where we permanently mark nodes as disabled every time an eventscript fails (This used to be ctdb commit 68a8ee99b128a5ec883600735626bdb3bbc9c503)	2009-12-14 15:40:54 +11:00
Volker Lendecke	f6ea3e6bcf	Make fetch_locked more scalable This patch improves the handling of the fetch_lock operation on non-persistent databases that ctdb clients have to do very frequently. The normal flow how this goes is the following: 1. Client does a local fetch_lock on the database 2. Client looks if the local node is dmaster. If yes, everything is fine If no, continue here 3. Client unlocks the local record 4. Client issues a "get me the record" call to ctdbd 5. ctdbd goes out and fetches the dmaster role 6. ctdbd tells the client to retry 7. Client starts over again The problem is between step 6 and 7: Before the client has had the chance to retry (i.e. catch the record with a fetch_locked), another node might have come asking ctdbd to migrate away the record again. This is a real problem, I've seen >20 loops of this kind in real workloads. This patch does the following: Whenever ctdb receives a record as result of step 5, it puts the key on a "holdback list". As long as a key is on this list, a request to migrate away the dmaster is put on hold. It is the client's duty to issue the "CTDB_CONTROL_GOTIT" control when it has successfully done step 2 after having asked ctdb to fetch the record. This will release the key from the "holdback list" and re-issue all dmaster migration requests. As a safeguard against malicious clients, once a second (default 1000msecs, tunable "HoldbackCleanupInterval" in milliseconds) ctdbd goes over the list of held back keys, deletes them and releases all held back migration requests. (This used to be ctdb commit 5736e17c139c9a8049e235429aeae0c6c9d0e93d)	2009-12-12 00:45:39 +01:00
Michael Adam	46de365e78	Add a new control CTDB_GET_DB_SEQNUM - fetch a persistent db's sequence number. Michael (This used to be ctdb commit a7e3b5fac6b3f5d74473f26eb86c067b35647996)	2009-12-12 00:45:39 +01:00
Michael Adam	8dedde81cd	define CTDB_DB_SEQNUM_KEY - used with the new implementation of transactions. Michael (This used to be ctdb commit 4b1dbcf0853bdc4832d39a477823ae34f216da52)	2009-12-12 00:45:38 +01:00
Volker Lendecke	24d04a3e89	Rename a struct member for clarity (This used to be ctdb commit 6af5e74a21546d723008d69d6752ebebf898c947)	2009-12-12 00:45:37 +01:00
Michael Adam	faacd5ca79	server: add a new control CTDB_CONTROL_TRANS3_COMMIT This is a simplified version of the trans2 commit control: It just rolls out the marshall buffer to all active nodes. It is the main ctdbd part of the re-implementation of the persistent transactions. The client code is changed to take a global lock to start a transactions and store into the marshal buffer instead of writing to the local tdb under a local transaction. The old transaction implementation is going to be removed in a later commit. Michael (This used to be ctdb commit f66428f9d2013080a414404c1ba6117888352fd6)	2009-12-12 00:43:26 +01:00
Ronnie Sahlberg	a8549ef700	From: Volker Lendecke <vl@samba.org> Date: Wed, 9 Dec 2009 22:45:12 +0100 Subject: [PATCH] Revert an accidental commit (This used to be ctdb commit af6656f2844d8fd72204a70358c9d589dbe1bd34)	2009-12-10 08:53:55 +11:00
Volker Lendecke	a0d9bd3c13	Run only one event for each epoll_wait/select call This might be a bit less efficient, but experience in winbind has shown that event callbacks can trigger changes in the socket state in very hard to diagnose ways. (This used to be ctdb commit a78b8ea7168e5fdb2d62379ad3112008b2748576)	2009-12-10 07:52:16 +11:00
Rusty Russell	a46c3b4f2a	ctdb: scriptstatus can now query non-monitor events We also no longer return an error before scripts have been run; a special zero-length data means we have never run the scripts. "ctdb scriptstatus all" returns all event script results. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 9b90d671581e390e2892d3a68f3ca98d58bef4df)	2009-12-08 01:50:55 +10:30
Rusty Russell	5d99a1a47c	eventscript: expose call names and enum We're going to need this so ctdb can query non-monitor status. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 53bc5ca23ca55a3ac63a440051f16716944a2a51)	2009-12-08 01:47:13 +10:30
Rusty Russell	d3593c2f83	eventscript: save state for all script invocations Rather than only transferring to last_status for monitor events, do it for every event (ctdb->last_status is now an array). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit c73ea56275d4be76f7ed983d7565b20237dbdce3)	2009-12-08 12:27:48 +10:30
Rusty Russell	9753b7e793	eventscript: rename ctdb_monitoring_wire to ctdb_scripts_wire We're going to allow fetching status of all script runs, so this name is no longer appropriate. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit f5cb41ecf3fa986b8af243e8546eb3b985cd902a)	2009-12-08 00:51:24 +10:30
Rusty Russell	23e24c503c	eventscript: ctdb_fork_with_logging() A new helper function which sets up an event attached to the child's stdout/stderr which gets routed to the logging callback after being placed in the normal logs. This is a generalization of the previous code which was hardcoded to call ctdb_log_event_script_output. The only subtlety is that we hang the child fds off the output buffer; the destructor for that will flush, which means it has to be destroyed before the output buffer is. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 32cfdc3aec34272612f43a3588e4cabed9c85b68)	2009-12-08 12:44:30 +10:30
Rusty Russell	c309d22f9a	eventscript: remove unused ctdb_ctrl_event_script* The child no longer uses ctdb_ctrl_event_script_init or ctdb_ctrl_event_script_finished, and the others are redundant: it doesn't need to tell us it's starting a script when it only runs one. We move start and stop calls to the parent, and eliminate the RPC infrastructure altogether. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 391926a87a7af73840f10bb314c0a2f951a0854c)	2009-12-08 00:27:40 +10:30
Rusty Russell	b8e347ec9c	eventscript: use direct script state pointer for current monitor We put a "scripts" member in ctdb_event_script_state, rather than using a special struct for monitor events. This will fit better as we further unify the different events, and holds the reports from the child process running each monitor script. Rather than making the monitor state a child of current_monitor_status_ctx, we just point current_monitor directly at it. This means we need to reset that pointer in the destructor for ctdb_event_script_state. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 9a2b4f6b17e54685f878d75bad27aa5090b4571f)	2009-12-08 00:14:01 +10:30
Rusty Russell	a4c2a98ba9	eventscript: make current_monitor_status_ctx serve as monitor_event_script_ctx We have monitor_event_script_ctx and other_event_script_ctx, and current_monitor_status_ctx in struct ctdb_context. This seems more complex than it needs to be. We use a single "event_script_ctx" as parent for all event script state structures. Then we explicitly reparent monitor events under current_monitor_status_ctx: this is freed every script invocation to kill off any running scripts anyway. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 0d925e6f2767691fa561f15bbb857a2aec531143)	2009-12-08 00:09:20 +10:30
Rusty Russell	5190932507	eventscript: expose ctdb_ban_self() eventscript.c uses this now, but our next patch makes others use it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit a305cb7743c24386e464f6b2efab7e2108bb1e7e)	2009-12-07 23:18:40 +10:30
Rusty Russell	b9b75bd065	eventscript: use -ENOEXEC for disabled status value This unifies code paths and simplifies things: we just hand -ENOEXEC to ctdb_ctrl_event_script_stop(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit eadf5e44ef97d7703a7d3bce0e7ea0f21cb11f14)	2009-12-07 23:11:47 +10:30
Rusty Russell	066a791770	eventscript: use -ETIME for timeout status value This starts the move toward more expressive encoding of return values: positive values mean the script ran, negative means we had a problem with the script (and the value is the errno). This does timeout, but changes the ctdb tool to recognize it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 0eb1d0aa14e68b598d9e281c8a02b8f94a042fd9)	2009-12-07 23:09:42 +10:30
Rusty Russell	85a6f4a4dd	eventscript: marshall onto last_status immediately This simplifies the code a little: last_status is now ready to go (it's only used by the scriptstatus command at the moment). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 6be931266a4e41fd0253f760936ad9707dd97c47)	2009-12-07 23:09:40 +10:30
Michael Adam	0635f8b98f	make ctdb_ctrl_transaction_active public. Michael (This used to be ctdb commit e5496a83ef4a01604195b27c4b97f50d4979510e)	2009-12-04 11:30:22 +01:00
Ronnie Sahlberg	e28c652cca	Don't store debug level DEBUG_DEBUG in the in-memory ringbuffer. It is unlikely we will need something this verbose for normal troubleshooting. This allows us to keep a significantly longer time interval of log messages in the 500k slots available in the ringbuffer. (This used to be ctdb commit cc99c05c0c6484ad574039a454e6133852cb41fa)	2009-12-04 11:45:37 +11:00
Ronnie Sahlberg	6bad4a4836	Add a proper function to process a process-exist control in the daemon. This control is only used by samba when samba wants to check if a subrecord held by a <node-id>:<smbd-pid> is still valid or if it can be reclaimed. If the node is banned or stopped, we kill the smbd process and return that the process does not exist to the caller. This allows us to recover subrecords from stopped/banned nodes where smbd is hung waiting for the databases to thaw. bz58185 (This used to be ctdb commit 157807af72ed4f7314afbc9c19756f9787b92c15)	2009-12-02 13:58:27 +11:00
Ronnie Sahlberg	1c7de7a2ed	Add a double linked list to the ctdb_context to store a mapping between client pids and client structures. Add the mapping to the list every time we accept() a new client connection and set it up to remove in the destructor when the client structure is freed. (This used to be ctdb commit f75d379377f5d4abbff2576ddc5d58d91dc53bf4)	2009-12-02 13:41:04 +11:00
Ronnie Sahlberg	569001afd0	Merge commit 'martins/status-test-2' Conflicts: server/eventscript.c (This used to be ctdb commit e9b3477a5b9a2eff18f727e7d59338bfb5214793)	2009-12-01 10:53:18 +11:00
Martin Schwenke	a64ccf07c1	Add flag to ctdb_event_script_callback indicating when called by client. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a1d654a982ca56fade82552f4e6b5586236d3233)	2009-11-26 15:49:49 +11:00
Rusty Russell	3188df4a88	eventscript: check that ctdb forced script events correct Now we're doing checking, we might as well make sure the commands from "ctdb eventscripts" are valid. This gets rid of the "UNKNOWN" event type. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 1d24a3869fe89fc9a109fd9e9b69df5fc665a5f6)	2009-11-25 11:02:29 +10:30
Rusty Russell	2d9254404d	eventscript: introduce enum for different event script calls. Rather than doing strcmp everywhere, pass an explicit enum around. This also subtly documents what options are available. The "options" arg is now used for extra arguments only. Unfortunately, gcc complains on empty format strings, so we make ctdb_event_script() take no varargs, and add ctdb_event_script_args(). We leave ctdb_event_script_callback() taking varargs, which means callers have to do "%s", "". For the moment, we have CTDB_EVENT_UNKNOWN for handling forced scripts from the ctdb tool. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 8001488be4f2beb25e943fe01b2afc2e8779930d)	2009-11-24 11:16:49 +10:30
Rusty Russell	2763df22de	eventscript: put timeout inside ctdb_event_script_callback_v Everyone uses the same timeout value, so just remove it from the API. If we ever need variable timeouts, that might as well be central too. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 533c3e053293941d2a9484b495e78d45f478bb08)	2009-11-24 11:09:46 +10:30
Ronnie Sahlberg	e6b69fa760	rework and simplify the eventscript handling This version has no trailing whitespace, and fixed (This used to be ctdb commit defbe318152fc479e8076ad70433cdb4971951af)	2009-11-25 11:00:11 +10:30
Ronnie Sahlberg	ae209c74c8	don't reset the event script context every time we start a new "ctdb eventscript ..." command. Use the existing context used for non-monitor events Multiple concurrent uses of "ctdb eventscript ..." could otherwise lead to a SEGV (This used to be ctdb commit 80a8d728e9680040e00d24361dfc9367dd372a56)	2009-11-19 11:03:51 +11:00
Ronnie Sahlberg	cc2d81a77c	make the ringbuffer logging more efficient and marshall the data by writing to a tmpfile instead of continuously talloc resizing a blob (This used to be ctdb commit 6427f0b68d60b556a023f64e15e156000ba6f943)	2009-11-18 19:10:50 +11:00
Ronnie Sahlberg	bc2675119d	add an in memory ringbuffer where we store the last 500000 log entries regardless of log level. add commands to extract this in memory buffer and to clear it (This used to be ctdb commit 29d2ee8d9c6c6f36b2334480f646d6db209f370e)	2009-11-18 12:44:18 +11:00
Ronnie Sahlberg	61de178e0a	set up a pipe between the main daemon and the child we use for syslogging so that we can clean up the child process when we stop ctdbd (This used to be ctdb commit cb8df973ccd446d87fbdd9a27843e54841ba5d89)	2009-11-16 15:17:32 +11:00
Ronnie Sahlberg	93d902e8f7	test of a change to make ctdbd use "status" event instead of the "monitor" event. This allows running the actual monitoring asynchronously from ctdbd and only using "status" to pick up the actual results. (This used to be ctdb commit 1908bac812650ca25151051f5d86815e0b8ed319)	2009-11-13 12:37:55 +11:00
Ronnie Sahlberg	e33722a569	start the syslog child a little later, after we have forked and detached from the local shell (This used to be ctdb commit 9ffd54b73c0d64b67e8e736d7cb54490e77ffa78)	2009-10-30 19:39:11 +11:00
Ronnie Sahlberg	5d73f19418	create a child process to write to syslog. use a udp socket on the ctdbd port to send messages to the syslog child process for logging. we need this when syslog becomes "slow", like very slow, and on boxes where syslog is limited to 100 lines per second and starts to block after that (This used to be ctdb commit 1446f4c247310e2ff2d522055bd8927d1a78d017)	2009-10-30 18:53:17 +11:00
Michael Adam	0113744fec	server: trans2_active: don't report a transaction active on the node that performs the transaction Otherwise a node can lock itself out, e.g. when a commit control times out... Michael (This used to be ctdb commit cb432e30351d5e5a41e98da3c7b1c2a4d400a3a2)	2009-10-30 09:22:18 +11:00
Ronnie Sahlberg	023d09cd38	Revert "update the "uptime" command to indicate the "time since last" is the time since the last recovery OR failover." This reverts commit 3b0d44497800a16400d05a30bdaf6e6c285d4b36. (This used to be ctdb commit cb36bbb5418290e8e5b770d2d836285b15da2a6f)	2009-10-29 10:49:00 +11:00
Ronnie Sahlberg	279b7ca564	update the "uptime" command to indicate the "time since last" is the time since the last recovery OR failover. (This used to be ctdb commit 3b0d44497800a16400d05a30bdaf6e6c285d4b36)	2009-10-29 10:37:10 +11:00
Michael Adam	abac42ca34	server: add a new ctdb control CTDB_TRANS2_ACTIVE This asks the daemon whether a transaction is currently active on a given DB on that node. More precisely this asks for the transaction_active flag in the ctdb_db_context that is set in the CTDB_TRANS2_COMMIT control and cleared in the CTDB_TRANS2_ERROR or CTDB_TRANS2_FINISHED controls. This will be useful for fixing race conditions in the transaction code. Michael (This used to be ctdb commit 8d430ae6968dfe566614379436fc3c56003fcd88)	2009-10-29 10:14:30 +11:00
Ronnie Sahlberg	d379b30182	create a separate context for non-monitor eventscripts so they dont collide (This used to be ctdb commit 325de818f88f339a16dc4544e899a2d735933c44)	2009-10-28 17:35:15 +11:00
Ronnie Sahlberg	e07ca41886	change the eventscript handling to allow EventScriptTimeout for each individual script instead of for the entire set of scripts restructure the talloc hierarchy to allow this (This used to be ctdb commit 64da4402c6ad485f1d0a604878a7b0c01a0ea5f0)	2009-10-28 16:11:54 +11:00
Ronnie Sahlberg	4d40b86805	for debugging add a global variable holding the pid of the main daemon. change the tracking of time() in the event loop to only check/warn when called from the main daemon (This used to be ctdb commit a10fc51f4c30e85ada6d4b7347b0f9a8ebc76637)	2009-10-27 13:18:52 +11:00
Ronnie Sahlberg	86d1b4c465	Add a mechanism where we can register notifications to be sent out to a SRVID when the client disconnects. The way to use this is from a client to : 1, first create a message handle and bind it to a SRVID A special prefix for the srvid space has been set aside for samba : Only samba is allowed to use srvid's with the top 32 bits set like this. The lower 32 bits are for samba to use internally. 2, register a "notification" using the new control : CTDB_CONTROL_REGISTER_NOTIFY = 114, This control takes as indata a structure like this : struct ctdb_client_notify_register { uint64_t srvid; uint32_t len; uint8_t notify_data[1]; }; srvid is the srvid used in the space set aside above. len and notify_data is an arbitrary blob. When notifications are later sent out to all clients, this is the payload of that notification message. If a client has registered with control 114 and then disconnects from ctdbd, ctdbd will broadcast a message to that srvid to all nodes/listeners in the cluster. A client can register itself with as many different srvid's as it wants, but this is handled through a linked list from the client structure so it is mainly designed for "few notifications per client". 3, a client that no longer wants to have a notification set up can deregister using control CTDB_CONTROL_DEREGISTER_NOTIFY = 115, which takes this as arguments : struct ctdb_client_notify_deregister { uint64_t srvid; }; When a client deregisters, there will no longer be sent a message to all other clients when this client disconnects from ctdbd. (This used to be ctdb commit f1b6ee4a55cdca60f93d992f0431d91bf301af2c)	2009-10-23 15:24:51 +11:00
Ronnie Sahlberg	9b8c72c446	When clients have blocked, perhaps because the node is banned or stopped and the client is blocked trying to tdb_fetch() a record, make sure we don't queue up too many REQ_MESSAGES. Add a new tunable to control the maximum queue size we allow to a blocked client before we start discarding REQ_MESSAGES instead of queueing them for delivery. This avoids having queued up very very large number of MESSAGES that samba sends between each other to nodes that are blocked/banned/stopped for extended periods. (This used to be ctdb commit f76d6fed8f9630450263b9fa4b5fdf3493fb1e11)	2009-10-21 15:20:55 +11:00
Ronnie Sahlberg	d788dd3627	From wolfgang Mueller Add a tuneable so that when scripts starts to hang/timeout, we can make the node unhealthy instead of banned (This used to be ctdb commit 2e9fc6f0609833c6d8146196011ef780669d615d)	2009-10-20 12:59:48 +11:00
Ronnie Sahlberg	80be59d35e	when we change state between healthy/unhealthy, make sure we ask the recovery master to perform an explicit ip reallocation. This is more reliable and faster than having the recovery daemon track these changes, and since we now have an explicit method to ask the recovery daemon to perform an explicit ip reallocation, we should use this. (This used to be ctdb commit 3807681e74f4bfe92befdae6ed616ff5f1a99880)	2009-10-14 11:59:16 +11:00
Ronnie Sahlberg	122c423b82	add a new control for explicitly cancelling recovery transactions, i.e. the transactions we start across all tdb databases during the recovery. this allows us to properly clean up and delete these tdb transactions on a recovery failure. (This used to be ctdb commit b2ce8b900a7d00944c84e0574fea5b371064a06d)	2009-10-12 16:48:05 +11:00
Ronnie Sahlberg	73c0adb029	initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2)	2009-10-12 12:08:39 +11:00
Ronnie Sahlberg	d4c98516a2	update the freeze/thaw commands to be able to send the requested database priority to freeze/thaw to the daemon. this is encoded in the srvid field of the request header (This used to be ctdb commit 0cb3d33caa42ed783e03bc825b181dde4cf63616)	2009-10-12 09:22:17 +11:00
Ronnie Sahlberg	3219f81710	add a control to read the db priority from a database (This used to be ctdb commit ca6d045e419f308f57e74d4c978907afb05ddb85)	2009-10-10 15:04:18 +11:00
Ronnie Sahlberg	6cf7d8e131	add a control to set a database priority. Let newly created databases default to priority 1. database priorities will be used to control the order in which databases are locked during recovery. (This used to be ctdb commit 67741c0ee01916d94cace8e9462ef02507e06078)	2009-10-10 14:26:09 +11:00
Ronnie Sahlberg	166b1c97b4	add a new message to ask the recovery daemon to temporarily disable checking ip address consistency. This is useful when we are moving addresses using moveip in the cluster since otherwise if we collide with the recovery daemons own check we could cause a recovery (This used to be ctdb commit 9c63858c0b22c81eaccb9865a414af0bbb2833d4)	2009-10-06 12:11:32 +11:00
Ronnie Sahlberg	71e4259150	add a new function to collect a list of all active nodes EXCEPT a certain node (This used to be ctdb commit be52954d921e7d443304cf49fbd488c619a9c4ec)	2009-10-06 10:52:31 +11:00
Ronnie Sahlberg	c971d934a9	From Wolfgang Mueller-Friedt Remove the explicit vacuum/repack commands from the 00.ctdb eventscript and implement this in the ctdb daemon. Combine vacuuming and repacking into one cheap read traverse to enumerate all candidate records and one write traverse that both repacks the database and also deletes the record locally where we are lmaster and where the records have already been deleted remotely. this code also adds initial autotuning heuristics for the vacuum intervals and how many records to delete in each iteration. minor stylish changes made by ronnie s (This used to be ctdb commit 95a3ee551241aa164967991fe5efe078e1714bde)	2009-09-29 13:27:19 +10:00
Ronnie Sahlberg	cda5f02c7c	new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a)	2009-09-04 02:20:39 +10:00
Ronnie Sahlberg	1cc79905ad	add new controls to make it possible to enable/disable individual eventscripts update scriptstatus output so it lists disabled scripts (This used to be ctdb commit 7e799b7523c9699bd65a8a8207f7e03d668b0b81)	2009-08-13 13:04:08 +10:00
Wolfgang Mueller-Friedt	16af87bf25	repack limit tunable Signed-off-by: Wolfgang Mueller-Friedt <wolfmuel@de.ibm.com> (This used to be ctdb commit a2768b0732f2ab2e3fafda55587bd2e99eedf0fa)	2009-07-29 13:30:39 +10:00
Ronnie Sahlberg	fb63d27e4e	initial part of new vacuuming patch. create some new fields for ctdb_db and tunables (This used to be ctdb commit 3a8e7d36cc42aedf4b7665364224140dcbfb3efa)	2009-07-29 13:25:43 +10:00
Michael Adam	4cd06a330e	Fix persistent transaction commit race condition. In ctdb_client.c:ctdb_transaction_commit(), after a failed TRANS2_COMMIT control call (for instance due to the 1-second being exceeded waiting for a busy node's reply), there is a 1-second gap between the transaction_cancel() and replay_transaction() calls in which there is no lock on the persistent db. And due to the lack of global state indicating that a transaction is in progress in ctdbd, other nodes may succeed to start transactions on the db in this gap and even worse work on top of the possibly already pushed changes. So the data diverges on the several nodes. This change fixes this by introducing global state for a transaction commit being active in the ctdb_db_context struct and in a db_id field in the client so that a client keeps track of _which_ tdb it has a transaction commit running on. These data are set by ctdb upon entering the trans2_commit control and they are cleared in the trans2_error or trans2_finished controls. This makes it impossible to start another transaction or migrate a record to a different node while a transaction is active on a persistent tdb, including the retry loop. This approach is dead lock free and still allows recovery process to be started in the retry-gap between cancel and replay. Also note, that this solution does not require any change in the client side. This was debugged and developed together with Stefan Metzmacher <metze@samba.org> - thanks! Michael (This used to be ctdb commit f88103516e5ad723062fb95fcb07a128f1069d69)	2009-07-29 11:12:39 +10:00
Ronnie Sahlberg	37d68c58b8	add two commands : setlmasterrole and setrecmasterrole to enable/disable these capabilities at runtime (This used to be ctdb commit 51aaed0e9e42e901451292e8dd545297ab725a62)	2009-07-28 13:45:13 +10:00
Ronnie Sahlberg	72e2380e92	add a command "setnatgwstate {on|off}" that can be used to indicate if this node is using natgw functionality or not. (This used to be ctdb commit 89a9bb29a60a6fb1fba55987e6cf0a4baa695e50)	2009-07-28 09:58:11 +10:00
Ronnie Sahlberg	e5e9fc48b1	create a new event : stopped. This event is called when a node is stopped and is used by eventscripts that need to do certain cleanup and removal of configuration or ip addresses or routing ... Note that a STOPPED node is considered "inactive" and as such will not be running the "recovered" event when the rest of the cluster has recovered. (This used to be ctdb commit 65e9309564611bf937ded3c74a79abff895d7c59)	2009-07-17 12:26:16 +10:00
Ronnie Sahlberg	88f3c40d9c	add two new controls, STOP_NODE and CONTINUE_NODE that are used to stop/continue a node instead of using modflags messages (This used to be ctdb commit 54b4a02053a0f98f8c424e7f658890254023d39a)	2009-07-09 12:22:46 +10:00
Ronnie Sahlberg	66c8d4fb3d	make it possible to start the daemon in STOPPED mode (This used to be ctdb commit 866aa995dc029db6e510060e9e95a8ca149094ac)	2009-07-09 11:57:20 +10:00
Ronnie Sahlberg	9f0dc4b93b	Add a new node flag : STOPPED This node flag means the node is DISABLED and that all its public ip addresses are failed over, but also that it has been removed from the VNNmap. A STOPPED node should be in recovery mode active until restarted using the continue command. Adding two new commands "ctdb stop" "ctdb continue" (This used to be ctdb commit d47dab1026deba0554f21282a59bd172209ea066)	2009-07-09 11:38:18 +10:00
Ronnie Sahlberg	289c58e9b6	add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has completed (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216)	2009-07-02 13:00:26 +10:00
Ronnie Sahlberg	93026f4cbf	update the handling of debug levels so that we always can use a literal instead of a numeric value. validate the input values used and refuse setting the debug level to an unknown value (This used to be ctdb commit daec49cea1790bcc64599959faf2159dec2c5929)	2009-07-01 09:17:13 +10:00
Ronnie Sahlberg	5b235c3999	add a control to set the reclock file (This used to be ctdb commit 36cc2e586f03fa497ee9b06f3e6afc80219c4aaa)	2009-06-25 14:25:18 +10:00
Ronnie Sahlberg	2b253c094c	add a control to read the current reclock file from a node (This used to be ctdb commit ed6a4cbcdcbb4e0df83bec8be67c30288bf9bd41)	2009-06-25 12:17:19 +10:00
Ronnie Sahlberg	e6170b5389	add a new node state : DELETED. This is used to mark nodes as being DELETED internally in ctdb so that nodes are not renumbered if / when they are removed from the nodes file. This is used to be able to do "ctdb reloadnodes" at runtime without causing nodes to be renumbered. To do this, instead of deleting a node from the nodes file, just comment it out like 1.0.0.1 #1.0.0.2 1.0.0.3 After removing 1.0.0.2 from the cluster, the remaining nodes retain their pnn's from prior to the deletion, namely 0 and 2 Any line in the nodes file that is commented out represents a DELETED pnn (This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343)	2009-06-01 14:18:34 +10:00
Sumit Bose	11988fc77a	structure member node_list_file is not used anywhere (This used to be ctdb commit 0e84ea23d1d998d4d4ac7d8a858b3d8294f056cb)	2009-05-21 11:16:43 +10:00
Sumit Bose	9171a7784c	structure member logfile is not used anywhere (This used to be ctdb commit 4f86c991812c2d0bddbe3de9a9906cf5df118cd4)	2009-05-21 11:15:43 +10:00
Ronnie Sahlberg	98a54c4675	Track how long it takes to take out the recovery lock from both the main daemon and also from the recovery daemon. Log this in "ctdb statistics". Also add a variable "RecLockLatencyMs" that will log an error every time it takes longer than this to access the reclock file. (This used to be ctdb commit 042377ed803bb8f7ca9d6ea1a387427b7b8ba45a)	2009-05-14 10:33:25 +10:00
root	af25fa38f3	fixed a problem with clients disconnecting during a traverse When a client (such as smbstatus) is killed, it may have outstanding traverse children on remote nodes. We need to catch the client disconnect in ctdbd and send a control to all nodes telling them to kill those outstanding traverse children. (This used to be ctdb commit f2fb2df4619a14f7f6c11f9132ee7d793028042c)	2009-05-06 07:32:25 +10:00
root	6793f077a8	Add a new variable VerifyRecoveryLock which can be used to disable the test that the recovery daemon holds the lock properly when performing a recovery (This used to be ctdb commit 329df9e47e6ca8ab5143985a999e68f37c6d88a5)	2009-05-01 01:17:59 +10:00
Ronnie Sahlberg	38ea6708dd	add a tuneable RecoveryDropAllIPs so it is possible to control after how long a node that has been stuck in recovery will wait until it will yield all public addresses. this now defaults to 60 seconds This is useful if a split brain occurs due to network partitioning since it will make sure that the "other half" of the cluster that does not contain the recovery master will eventually release all ips and thus avoiding a duplicate ip situation for the public addresses (This used to be ctdb commit 70f21428c9eec96bcc787be191e7478ad68956dc)	2009-04-24 18:28:08 +10:00
Ronnie Sahlberg	d94917ec49	Change the (dodgy) seqnumfrequency variable to have ms resolution instead of second resolution. Rename the variable to SeqnumInterval because 1, it is an interval and not a 1/interval unit, and 2, so that we catch when people use the old variable and can update the sysconfig file instead of silently changing the semantics of this variable. This is a real dodgy variable (This used to be ctdb commit 68eac459e5d2b6b534f72821036675ffe5d7a350)	2009-04-01 17:21:38 +11:00
Ronnie Sahlberg	297ab50173	remove a prototype for a function no longer used (This used to be ctdb commit 9ac9745ba9296d01e3b18148ae8c3240e51cf090)	2009-04-01 17:13:48 +11:00
Ronnie Sahlberg	ad40ee25f9	add a mechanism where the ctdb daemon will run a user-controlled script when the node status changes to/from UNHEALTHY state. This would allow a sysadmin to set up ctdb to send an email/snmptrap/... when the status of the node changes. (This used to be ctdb commit ce534a83a05dbd40238e4eee0669d60ff396f935)	2009-03-31 14:23:31 +11:00
Ronnie Sahlberg	689f76f0b0	Merge branch 'obnox' (This used to be ctdb commit 972036a5d510fb9b399f1ee34a8861dee4221267)	2009-03-24 17:49:55 +11:00

1043 Commits