samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2024-12-27 03:21:53 +03:00

Author	SHA1	Message	Date
Amitay Isaacs	4736486188	ctdb-daemon: Rename ctdb_mkdir_p_or_die to mkdir_p_or_die This function does not require ctdb context. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-06-12 05:40:10 +02:00
Amitay Isaacs	8c8ef5640e	ctdb-daemon: Rename ctdb_lockdown_memory to lockdown_memory Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-06-12 05:40:10 +02:00
Amitay Isaacs	22f71579a4	ctdb-daemon: Instead of passing ctdb context, pass valgrinding boolean Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-06-12 05:40:10 +02:00
Amitay Isaacs	da1a6a3d31	ctdb-common: Remove unused functions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-06-12 05:40:10 +02:00
Amitay Isaacs	3a9d375328	ctdb-common: Drop ctdb prefix from utility functions independent of ctdb Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-06-12 05:40:10 +02:00
Amitay Isaacs	5b580e5d65	ctdb-common: Changing scheduler policy does not require ctdb context Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-06-12 05:40:10 +02:00
Amitay Isaacs	19fcf6ff52	ctdb-common: No need to save previous scheduler priority When calling sched_setscheduler() with SCHED_OTHER, the only valid priority is 0. Nice value is "restored" anyway. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-06-12 05:40:10 +02:00
Amitay Isaacs	6edbbce887	ctdb-build: Move internal include files in a separate directory This will allow to build clustered samba with built-in ctdb tree rather than needing to install CTDB first. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org>	2014-05-27 13:43:11 +02:00
Amitay Isaacs	463ea9e525	ctdb-recoverd: Detach database from recovery daemon As part of vacuuming, recoverd attaches to databases to migrate records. When detaching a database from main daemon, it should be removed from recovery daemon also. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Wed Apr 23 17:05:45 CEST 2014 on sn-devel-104	2014-04-23 17:05:45 +02:00
Amitay Isaacs	ce18b3b00b	ctdb-client: Add client code to detach a database Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-04-14 03:52:39 +02:00
Amitay Isaacs	1c72842217	ctdb-daemon: Add control CTDB_CONTROL_DB_DETACH This detaches specified database from all the nodes. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-04-14 03:52:39 +02:00
Amitay Isaacs	01de7818de	ctdb-daemon: Always update database priority cluster wide Database priority is a global property and all the nodes should have the priority set for the databases. Just setting priority on one node can lead to problems in the recovery as a database can be frozen at wrong priority and then freezing database would not succeed. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: David Disseldorp <ddiss@samba.org> Autobuild-User(master): David Disseldorp <ddiss@samba.org> Autobuild-Date(master): Mon Apr 7 14:06:26 CEST 2014 on sn-devel-104	2014-04-07 14:06:26 +02:00
Martin Schwenke	9b907536fb	ctdb/daemon: Make delete IP wait until the IP is released reloadips really expects deleted IPs to be released before completing. Otherwise the recovery daemon starts failing the local IP check. The races that follow can cause a node to be banned. To make the error handling simple, do the actual deletion in release_ip_callback(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2014-03-23 04:20:15 +01:00
Amitay Isaacs	cbffbb7c2f	ctdb-daemon: Do not run monitor event if any other event is already running Any currently running monitor events are cancelled if any other events are scheduled. However, this does not stop monitor events to be run when other events are already running. Keep track of the number of active events and schedule monitor event only if there are no active events. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-01-21 11:30:41 +11:00
Martin Schwenke	e6304d1e1a	ctdb/daemon: Untangle serialisation of 1st recovery -> startup -> monitor At the moment ctdb_check_healthy() is overloaded to wait until the first recovery is complete, handle the "startup" event and also actually handle monitoring. This is untidy and hard to follow. Instead, have the daemon explicitly wait for 1st recovery after the "setup" event. When first recovery is complete, schedule a function to handle the "startup" event. When the "startup" event succeeds then explicitly enable monitoring. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2014-01-17 17:59:41 +11:00
Amitay Isaacs	a92fd11ad1	ctdb-daemon: Remove ctdb_fork_with_logging() This function has been replaced with ctdb_vfork_with_logging(). Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Thu Jan 16 04:05:35 CET 2014 on sn-devel-104	2014-01-16 04:05:35 +01:00
Amitay Isaacs	2879404388	ctdb-daemon: Add ctdb_vfork_with_logging() This will be used to spawn lightweight helper processes to run eventscripts. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-01-16 11:41:12 +11:00
Amitay Isaacs	7aa20ccb5c	ctdb-daemon: No need to call event scripts with CTDB_CALLED_BY_USER This was added to support external monitoring using CTDB event scripts. However, it was never used. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-01-16 11:41:12 +11:00
Amitay Isaacs	bafa467021	ctdb-daemon: Deprecate RELOAD and STATUS events These events have never been used. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-01-16 11:41:12 +11:00
Amitay Isaacs	d21919c8b4	ctdb-common: Refactor code to keep track of child processes This code can then be used to track child processes created with vfork(). Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org>	2013-11-27 18:46:16 +01:00
Amitay Isaacs	094f34e9bf	ctdb-locking: Implement active lock requests limit per database This limit was currently a global limit and not per database. This prevents any database freeze lock requests from getting scheduled if the global limit was reached. Only individual record requests should be limited and database freeze requests should always get scheduled. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org>	2013-11-27 18:46:16 +01:00
Martin Schwenke	e850cddcc4	ctdb-tools/ctdb: New ptrans command Also add test. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org>	2013-11-27 18:46:16 +01:00
Martin Schwenke	028fe930b6	ctdb-recoverd: Fix backward compatibility for CTDB_SRVID_TAKEOVER_RUN When running a mixed version cluster, compatibility with older versions was broken during recent refactorisation. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org>	2013-11-27 18:46:16 +01:00
Amitay Isaacs	41d37058ca	tunables: Remove obsolete tunables Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit ca5fc3431573c44d55d09d987c715fb53756fc1f)	2013-10-30 15:37:11 +11:00
Amitay Isaacs	7eb680a95f	build: Move the default CTDB socket from /tmp to /var/run/ctdb Use /var/run/ctdb/ctdbd.socket because there might be other daemons that need sockets in the future. The local daemons test code to create a link for the default convenience socket has to be removed because the link can't be created as a regular user in the new location. This should be OK since all calls to the ctdb tool in the test code should be wrapped in onnode. When debugging tests, a developer will have to set CTDB_SOCKET by hand. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Pair-programmed-with: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit dc67a4e24af9d07aead2a1710eeaf5d6cc409201)	2013-10-25 12:06:07 +11:00
Martin Schwenke	b595712f25	ctdbd: Simplify database directory setting logic No need to check if the options are set. The options are always set via static defaults. No need to talloc_strdup() the values via wrapper functions. The options aren't going away. Remove now unused ctdb_set_tdb_dir() and similar functions. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 1fe82f3d7b610547ff4945887f15dd6c5798a49b)	2013-10-25 12:06:06 +11:00
Martin Schwenke	bd73e017b0	common: New function ctdb_mkdir_p_or_die() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 7b971df79b0b63f83555205eacf48d49ca3a273a)	2013-10-25 12:06:06 +11:00
Martin Schwenke	c07e3830b3	common: New function mkdir_p() Behaves like mkdir -p. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit afe2145d91725daf1399f0a24f1cddcf65f0ec31)	2013-10-25 12:06:06 +11:00
Michael Adam	49fcfd2cb3	ctdb_client.h: fix build on AIX by removing C++-style comments Reported by John P Janosik <jpjanosi@us.ibm.com> Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 1f327401f2e181780937aa3f6c479376ff787f3f)	2013-10-23 00:53:56 +02:00
Martin Schwenke	e782b61732	ctdbd: Pass the public address file location in ctdb context No need to pass it as an extra argument to ctdb_start_daemon. Also ensure options.public_address_list gets a nice static default. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a3d63a9db89d08bb284b3b3a6db773422f21b477)	2013-10-22 15:37:54 +11:00
Martin Schwenke	4adc8f4f09	ctdbd: Default for event_script_dir should use CTDB_BASE Also get rid of ctdb_set_event_script_dir(). It creates an unnecessary copy of something that will be around for the lifetime of the process. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 21b4d1aba00902f1eee0cbf4f082b0794fd5b738)	2013-10-22 15:37:54 +11:00
Martin Schwenke	f9ce563135	ctdbd: Add nodes_file member to struct ctdb_context This allows ctdb_load_nodes_file() to move to ctdb_server.c and ctdb_set_nlist() to become static. Setting ctdb->nodes_file needs to be done early, before the nodes file is loaded. It is now set from CTDB_BASE instead of ETCDIR, so setting CTDB_BASE also needs to be done earlier. Unhack ctdbd_test.c - it no longer needs to define ctdb_load_nodes_file(). Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 20e705e63bd3b20837cc3ac92fdcf2a9650ccfc8)	2013-10-22 15:37:54 +11:00
Amitay Isaacs	c5ec04f24e	client: Reimplement persistent transaction code using TRANS3_COMMIT Implementing persistent transaction code from Samba. Persistent transaction code was reimplemented in Samba using g_lock.tdb to hold transaction locks and using TRANS3_COMMIT control. Implementation details: 1. When starting a transaction, create a record with "transaction-<dbid>" as key and store current server_id in the structure. 2. If a record already exists, some other client has already started a transaction. Verify that the process corresponding to server_id stored in the record really exists or it's a stale record and overwrite it. 3. All modifications to the actual persistent database are stored in a marshal buffer. 4. When transaction is committed, read the sequence number of the persistent database and increment it. Sequence number record is also stored in the marshal buffer. 5. Send the changed records (marshal buffer) in TRANS3_COMMIT control to all the active nodes. 6. If all controls succeed, verify that the sequence number has been incremented. Commit is successful. If any of the controls fail, abort the transaction. 7. In case sequence number has not yet been incremented, then database recovery has been triggered. So repeat from step 5. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 4e0f1971792c9431d8d51dc57d54ecc9e4576dd5)	2013-10-04 15:46:15 +10:00
Amitay Isaacs	be33efa3e4	ctdbd: Remove transaction code related to TRANS2 commits This removes data types and structure elements related to TRANS2 persistent transaction code. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 22a253b7ccf1ff854cddf0b67969dc84d7d6a654)	2013-10-04 15:20:25 +10:00
Amitay Isaacs	91d644325d	ctdbd: Deprecate TRANS2 commit controls Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 7d176352986317e63696d74252ff5d8eccb2fee5)	2013-10-04 15:20:25 +10:00
Amitay Isaacs	fe62936bb6	include: Remove unused set_dmaster structure Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 2ce3a48cc969d563c26dd295723416c0d7b077a2)	2013-10-04 15:20:25 +10:00
Amitay Isaacs	4ca9b96114	client: Add ctdb_ctrl_getdbseqnum() function Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 8cb1fbbfe88327c9c7ab68e8eded586dff611e57)	2013-10-04 15:15:34 +10:00
Amitay Isaacs	5d47f28e15	client: Add ctdb_ctrl_getdbstatistics() function Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 1e7fca5cdc1d7205cf084e35aace1a5dc46ea294)	2013-10-04 15:15:34 +10:00
Amitay Isaacs	105afa543e	client: Add ctdb_client_check_message_handlers() function Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit c9a9d14c91f203ce964a426a8a1e2c1715af2098)	2013-10-04 15:15:34 +10:00
Martin Schwenke	b33ee7a2a4	recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE The current implementation has a few flaws: * A takeover run is called unconditionally when the timer goes even if the recovery master role has moved. This means a node other than the recovery master can incorrectly do a takeover run. * The rebalancing target nodes are cleared in the setup for a takeover run, regardless of whether the takeover run succeeds. * The timer to force a rebalance isn't cleared if another takeover run occurs before the deadline. Any forced rebalancing will happen in the first takeover run and when the timer expires some time later then an unnecessary takeover run will occur. * If the recovery master role moves then the rebalancing data will stay on the original node and affect the next takeover run to occur if the recovery master role should come back to the original node. Instead, store an array of rebalance target nodes in the recovery master context. This is passed as an extra argument to ctdb_takeover_run() each time it is called and is cleared when a takeover run succeeds. The timer hangs off the array of rebalance target nodes, which is cleared if the node isn't the recovery master. This means that it is possible to lose rebalance data if the recovery master role moves. However, that's a difficult problem to solve. The best way of approaching it is probably to try to stop the recovery master role from jumping around unnecessarily when inactive nodes join the cluster. The long term solution is to avoid this nonsense completely. The IP allocation algorithm needs to cache state between runs so that it knows which nodes have just become healthy. This also needs recovery master stability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c51c1efe5fc7fa668597f2acd435dee16e410fc9)	2013-09-19 12:54:31 +10:00
Martin Schwenke	1793412de2	recoverd: Remove unused CTDB_SRVID_RELOAD_ALL_IPS and handler Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 4cd727439a0824ebb8dbcf737d9888ffc3c41184)	2013-09-19 12:54:31 +10:00
Martin Schwenke	5f0913d321	recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56)	2013-09-19 12:54:31 +10:00
Martin Schwenke	4c3f8dc3bb	recoverd: Make the SRVID request structure generic No need for a separate one for each SRVID. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d9c22b04d5aa7938a3965bd3144568664eb772ce)	2013-09-19 12:54:30 +10:00
Martin Schwenke	fe7f66547b	client: Remove unused function list_of_active_nodes_except_pnn() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d8a76cf79f07dfb5a93c6c9a13f16e3268c7dd57)	2013-09-11 15:35:03 +10:00
Martin Schwenke	1ae731198a	recoverd: Move struct ctdb_public_ip_list back into ctdb_takeover.c This is an internal structure. It was moved into ctdb_private.h a long time ago to allow unit testing. Unit test compilation was changed shortly afterwards to make this unnecessary. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit db57261d7dc264e161659a8c547f44fbd9e88eeb)	2013-08-22 17:00:20 +10:00
Amitay Isaacs	1467b666f2	Revert "LACOUNT: Add back lacount mechanism to defer migrating a fetched/read copy until after default of 20 consecutive requests from the same node" This reverts commit 035c0d981bde8c0eee8b3f24ba8e2dc817e5b504. This is a premature optimization. Record can bounce between nodes very quickly if it is a contended record. There is no need to hold a record on a node unnecessarily. In case record contention becomes bad, enabling sticky records on a database is a better idea. Conflicts: include/ctdb_private.h server/ctdb_tunables.c Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit ac417b0003f0116f116834ad2ac51482d25cfa0d)	2013-08-22 14:08:52 +10:00
Amitay Isaacs	de6b97ce4f	Revert "recoverd: Use correct tdb flags when creating missing databases" This reverts commit 10a057d8e15c8c18e540598a940d3548c731b0b4. This approach would not work when creating local databases since currently there is no control to receive TDB flags for remote databases. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit ca61eb776ab862bd269e45ee0f9f96e7e1e0e001)	2013-08-14 14:15:33 +10:00
Amitay Isaacs	f15e1a28a7	recoverd: Use correct tdb flags when creating missing databases When creating missing databases either locally or remotely, make sure to use the correct tdb flags from other nodes. Without this, volatile databases can get attached without TDB_INCOMPATIBLE_HASH flag. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 10a057d8e15c8c18e540598a940d3548c731b0b4)	2013-08-01 11:08:25 +10:00
Amitay Isaacs	d8fc36781c	ctdbd: Remove incomplete ctdb_db_statistics_wire structure Instead of maintaining another structure, add an element as place holder for marshall buffer of hot keys. This avoids duplication of the structure. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit e73b2e12adc9db1dedb48d32bba3a8406a80f4cd)	2013-07-29 16:00:46 +10:00
Amitay Isaacs	854216236b	Revert "ctdbd: Remove incomplete ctdb_db_statistics_wire structure" The structure cannot be removed without adding support for marshalling keys for hot records. This reverts commit 26a4653df594d351ca0dc1bd5f5b2f5b0eb0a9a5. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 023ca2e84f5ed064a288526b9c2bc7e06674dd81)	2013-07-29 16:00:46 +10:00
Amitay Isaacs	500b26e48f	common/system: Add ctdb_set_process_name() function Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit fc3689c977f48d7988eed0654fb8e5ce4b8bfc8b)	2013-07-10 14:33:19 +10:00
Amitay Isaacs	d46c24f4d0	ctdbd: No need for DeadlockTimeout tunable The code for deadlock detection and killing smbd process causing deadlock has been removed and replaced with external debug script. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 2211cd94bea266547d3e6f167d3160a6b23bec88)	2013-07-10 14:33:18 +10:00
Amitay Isaacs	d36aa928fd	ctdbd: Remove incomplete ctdb_db_statistics_wire structure Send the ctdb_db_statistics directly instead of first copying it to duplicate ctdb_db_statistics_wire structure. This simplifies the implementation of the control to get database statistics. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 26a4653df594d351ca0dc1bd5f5b2f5b0eb0a9a5)	2013-07-10 14:33:18 +10:00
Martin Schwenke	7290798a41	recoverd: Clean up log messages in remote IP verification The log messages in verify_remote_ip_allocation() are confusing because they don't include the PNN of the problem node, because it is not known in this function. Add the PNN of the node being verified as a function argument and then shuffle the log messages around to make them clearer. Also fold 3 nested if statements into just one. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f0942fa01cd422133fc9398f56b4855397d7bc86)	2013-07-05 15:52:33 +10:00
Martin Schwenke	dbd1759eae	util: New function ctdb_die() This is like ctdb_fatal() but exits cleanly without dumping core or generating a backtrace. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c0a9456692c88a7a5542cd893d8f326524d3f94e)	2013-07-05 15:52:33 +10:00
Amitay Isaacs	c6914e3891	banning: Make ctdb_local_node_got_banned() a void function When this function is called, we are already committed to banning and there is no point in failing this function. In case, freezing of databases fails, it will be fixed from recovery daemon. (This used to be ctdb commit bb178338658b4ae32382a1f62f7c21cee1d4878f)	2013-07-02 12:59:08 +10:00
Amitay Isaacs	622ccd09f9	freeze: Make ctdb_start_freeze() a void function If this function fails due to memory errors, there is no way to recover. The best course of action is to abort. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 46efe7a886f8c4c56f19536adc98a73c22db906a)	2013-07-02 12:59:08 +10:00
Martin Schwenke	6a52a87028	ctdbd: Refactor shutdown sequence Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b32fd04bfbf33062d45365b37a7247e272a76ceb)	2013-06-22 15:51:02 +10:00
Martin Schwenke	6d9667f01c	ctdbd: Add new runstate CTDB_RUNSTATE_FIRST_RECOVERY This adds more serialisation to the startup, ensuring that the "startup" event runs after everything to do with the first recovery (including the "recovered" event). Given that it now takes longer to get to the "startup" state, the initscript needs to wait until ctdbd gets to "first_recovery". Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit ed6814ff0a59ddbb1c1b3128b505380f60d7aeb7)	2013-05-24 14:08:07 +10:00
Martin Schwenke	77671b9ef5	ctdbd: New control CTDB_CONTROL_GET_RUNSTATE Also new client function ctdb_ctrl_get_runstate(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit dc4220e6f618cc688b3ca8e52bcb3eec6cb55bb1)	2013-05-24 14:08:07 +10:00
Martin Schwenke	63577c96db	ctdbd: Replace ctdb->done_startup with ctdb->runstate This allows states, including startup and shutdown states, to be clearly tracked. This doesn't include regular runtime "states", which are handled by node flags. Introduce new functions ctdb_set_runstate(), runstate_to_string() and runstate_from_string(). Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 8076773a9924dcf8aff16f7d96b2b9ac383ecc28)	2013-05-24 14:08:06 +10:00
Amitay Isaacs	1ddc7b0d10	locking: Remove functions that are not used anymore These functions were used in locking child process to do the locking. With locking helper, these are not required. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit c660f33c3eaa1b4a2c4e951c1982979e57374ed4)	2013-05-24 09:06:40 +10:00
Martin Schwenke	54e91df60d	recoverd: Move IP flags into ctdb_takeover.c These should never be seen outside the IP allocation code. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e143abd16ccde2e0edfe103673d31a5fb06b6aef)	2013-05-09 12:55:42 +10:00
Martin Schwenke	0445c988e2	recoverd: Fix tunable NoIPTakeoverOnDisabled, rename to NoIPHostOnAllDisabled This really needs to be per-node. The rename is because nodes with this tunable switched on should drop IPs if they become unhealthy (or disabled in some other way). * Add new flag NODE_FLAGS_NOIPHOST, only used in recovery daemon. * Enhance set_ipflags_internal() and set_ipflags() to setup NODE_FLAGS_NOIPHOST depending on setting of NoIPHostOnAllDisabled and/or whether nodes are disabled/inactive. * Replace can_node_servce_ip() with functions can_node_host_ip() and can_node_takeover_ip(). These functions are the only ones that need to look at NODE_FLAGS_NOIPTAKEOVER and NODE_FLAGS_NOIPHOST. They can make the decision without looking at any other flags due to previous setup. * Remove explicit flag checking in IP allocation functions (including unassign_unsuitable_ips()) and just call can_node_host_ip() and can_node_takeover_ip() as appropriate. * Update test code to handle CTDB_SET_NoIPHostOnAllDisabled. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 1308a51f73f2e29ba4dbebb6111d9309a89732cc)	2013-05-07 16:20:46 +10:00
Martin Schwenke	fa16cccf02	ctdbd: Remove the "stopped" event It isn't used, superseded by "ipreallocated". Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c2bb8596a8af6406ef50e53953884df9d6246a96)	2013-05-06 13:38:21 +10:00
Martin Schwenke	2e59cd5428	ctdbd: New control CTDB_CONTROL_IPREALLOCATED This is an alternative to using ctdb_run_eventscripts() that can be used when in recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 27a44685f0d7a88804b61a1542bb42adc8f88cb1)	2013-05-06 13:38:21 +10:00
Michael Adam	1aa09dd5c3	include: define CTDB_REC_RO_FLAGS - all read-only related record flags This is used for some checks Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-By: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit c7924ce6404bb18641b00d5fbd2fe9da9aaf7959)	2013-04-24 18:48:31 +10:00
Michael Adam	527976d02a	vacuum: introduce the RECEIVE_RECORDS control This is in preparation for turning the vacuuming on the lmaster into a two-phase process: - First the node sends the list of records to be vacuumed to all other nodes with this new RECEIVE_RECORDS control. The remote nodes should store the lmaster's empty current copy. - Only those records that could be stored on all other nodes are processed further. They are sent to all other nodes with the TRY_DELETE_RECORDS control as before for deletion. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-By: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit e397702e271af38204fd99733bbeba7c1db3a999)	2013-04-24 18:47:32 +10:00
Martin Schwenke	7ba42d2c89	util: Removed unused declaration of ctdbd_start() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 1e989894764e4cd1d551c44784d91cb295cd790d)	2013-04-18 13:22:12 +10:00
Martin Schwenke	7ccde44d30	include: Move ctdb_start_daemon() from ctdb_client.h to ctdb_private.h It really is internal. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit abb64f62efaa70df4b87c030b96300eafd98e6a3)	2013-04-18 13:22:12 +10:00
Martin Schwenke	dcf1ac34ab	ctdbd: Add --pidfile option Default is not to create a pid file. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 996e74d3db0c50f91b320af8ab7c43ea6b1136af)	2013-04-18 13:21:59 +10:00
Martin Schwenke	4ede763f3b	util: New functions ctdb_set_child_info() and ctdb_is_child_process() Must be called by all child processes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 59b019a97aad9a731f9080ea5be14d0dbdfe03d6)	2013-04-18 13:18:29 +10:00
Michael Adam	b1a6289b44	ctdbd: unimplement the unused SET_DMASTER control Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 2e92deef5221ee651028ef87138b3113f1fece91)	2013-04-17 12:44:08 +02:00
Amitay Isaacs	9e0f8fa09c	traverse: Add CTDB_CONTROL_TRAVERSE_ALL_EXT to support withemptyrecords Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit e691df43d20871468142c8fb83f7c7303c4ec307)	2013-04-17 12:30:59 +02:00
Amitay Isaacs	dd050cd4ba	util: Add hex_decode_talloc() to decode hex string into a binary blob Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 307416afda707b687f5e89e8438e45c154a4c806)	2013-03-25 17:45:23 +11:00
Amitay Isaacs	5d7efb4cf1	ctdbd: Add an index db for message list for faster searches When CTDB is busy with lots of smbd, CTDB was spending too much time in daemon_check_srvids() which searches a list of srvids in the registered message handlers. Using a hash based index significantly improves the performance of search in a linked list. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 3e09f25d419635f6dd679b48fa65370f7860be7d)	2013-03-06 15:32:33 +11:00
Martin Schwenke	dab2f6817d	client: New generic node listing function list_of_nodes() Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit a73bb56991b8c07ed0e9517ffcf0dc264be30487)	2013-02-20 14:44:38 +11:00
Martin Schwenke	689384a7b4	Logging: Fix breakage when freeing the log ringbuffer Commit a82d3ec12f0fda16d6bfa8442a07595de897c10e broke fetching from the log ringbuffer. The solution there is still generally good: there is no need to keep the ringbuffer in children created by ctdb_fork()... except for those special children that are created to fetch data from the ringbuffer! Introduce a new function ctdb_fork_no_free_ringbuffer() that does everything ctdb_fork() needs to do except free the ringbuffer (i.e. it is the old ctdb_fork() function). The new ctdb_fork() function just calls that function and then frees the ringbuffer in the child. This means all callers of ctdb_fork() have the convenience of having the ringbuffer freed. There are 3 special cases: * Forking the recovery daemon. We want to be able to fetch from the ringbuffer there. * The ringbuffer fetching code. Change the 2 calls in this code (main daemon, recovery daemon) to call ctdb_fork_no_free_ringbuffer() instead. While we're here, clear the log ringbuffer when the recovery daemon is forked, since it will contain a copy of the messages from the main daemon. Note to self: always test... even the most obvious patches... ;-) Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db5fa00474f8a83f1aa3b603fd756cc9b49ff4)	2013-02-07 11:26:29 +11:00
Martin Schwenke	bc5f0a2b65	ctdbd: Remove command-line option --debug-hung-script Use an environment variable instead. This just means that the initscript exports CTDB_DEBUG_HUNG_SCRIPT and the code checks for the environment variable. The justification for this simplification is that more debug options will be arriving soon and we want to handle them consistently without needing to add a command-line option for each. So, the convention will be to use an environment variable for each debug option. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0581f9a84e58764d194f4e04064c2c5b393c348b)	2013-02-05 16:05:13 +11:00
Martin Schwenke	f2428cadd8	ctdbd: Remove debug_hung_script_ctx The only allocation against this context is by ctdb_fork_with_logging(). This memory is freed by ctdb_log_handler() anyway. There should be no memory leak. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 501461cc3e132d4adee9e91b5d4513a26bae2846)	2013-02-05 16:05:13 +11:00
Martin Schwenke	f2ba0e8a65	Logging: New function ctdb_log_ringbuffer_free() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a4f622e85168f59417c11705f1734e0352e1d44a)	2013-02-05 12:40:30 +11:00
Amitay Isaacs	4a6fa39ff9	daemon: Protect against double free of callback state while shutting down When CTDB is shut down and monitoring has been stopped, monitor_context gets freed and all the callback states hanging off it. This includes callback state for current_monitor, if the current monitor event has not yet finished. As a result, when the shutdown event is called, current_monitor->callback state is not NULL, but it's actually freed and it's a dangling reference. So before executing callback function and freeing callback state check if ctdb->monitor->monitor_context is not NULL. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 7d8546ee4353851f0543d0ca2c4c67cb0cc75aea)	2013-01-09 14:39:23 +11:00
Amitay Isaacs	30299c387f	daemon: On shutdown, destroy timed events that check if recoverd is active When CTDB is shutting down, recovery daemon is stopped, but the event that checks if recovery daemon is still alive is not destroyed. So recovery master is restarted during shutdown if CTDB daemon takes longer to shutdown. There are two processes that check if recovery daemon is working. 1. ctdb_check_recd() - which checks every 30 seconds if the recovery daemon process exists. 2. ctdb_recd_ping_timeout() - which is triggered when recovery daemon fails to ping CTDB daemon. Both the events are periodic and need to be destroyed when shutting down. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 746168df2e691058e601016110fae818c6a265c3)	2013-01-09 13:20:26 +11:00
Martin Schwenke	80a2bb84e7	ctdbd: Remove debug option --node-ip, use --listen instead This effectively reverts d96cb02c2c24f9eabbc53d3d38e90dea49cff3e0 Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 496387a585b2c5778c808cf02b8e1435abde4c3e)	2013-01-07 10:35:39 +11:00
Amitay Isaacs	a73f13ada7	daemon: Add a tunable to enable automatic database priority setting Samba versions 3.6.x and older do not set the database priority. This can cause deadlock between Samba and CTDB since the locking order of database will be different. A hack was added for automatic promotion of priority for specific databases to avoid deadlock. This code should not be invoked with Samba version 4.x which correctly specifies the priority for each database. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 4a9e96ad3d8fc46da1cd44cd82309c1b54301eb7)	2013-01-05 01:14:57 +01:00
Amitay Isaacs	13518b9e33	daemon: Check if log_latency_ms is set before using it This fixes a bug where wrong variable is checked. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit f81e9add466b1d9b2796c09c6ba63b77296ea149)	2012-11-30 12:21:30 +11:00
Amitay Isaacs	442d9905fe	locking: Do not use RECLOCK for tracking DB locks and latencies RECLOCK is for recovery lock in CTDB. Do not override the meaning for tracking locks on databases. Database lock latency has nothing to do with recovery lock latency. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 54e24a151d2163954e5a2a1c0f41a2b5c19ae44b)	2012-11-14 15:51:59 +11:00
Amitay Isaacs	85c8deca3f	recoverd: Track the nodes that fail takeover run and set culprit count If any of the nodes fail takeover run (either due to timeout or failure to complete within takeover_timeout interval) from main loop, recovery master will give up trying takeover run with following message: "Unable to setup public takeover addresses. Try again later" And as a side-effect the monitoring is disabled on all the nodes. Before ctdb_takeover_run() is called from main loop, monitoring get disabled via startrecovery event. Since ctdb_takeover_run() fails, it never runs recovered event and monitoring does not get re-enabled. In main_loop, ctdb_takeover_run() is called with a takeover_fail_callback. This callback will get called if any of the nodes fail in handling takeip/releaseip/ipreallocated events in ctdb_takeover_run(). Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit a5c6bb1fffb8dc3960af113957a1fd080cc7c245)	2012-11-14 10:59:54 +11:00
Martin Schwenke	db5dfe891c	recoverd: Add CTDB_SRVID_GETLOG and CTDB_SRVID_CLEARLOG These support getting and clearing logs from the ring-buffer in the recovery daemon. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit cbca233d1e03b2410e0bb63b936328d4a8b3c7b4)	2012-10-22 11:15:36 +11:00
Amitay Isaacs	bc126ccdd4	build: Set CTDB_PATH to /tmp/ctdb.socket if SOCKPATH is not defined When building samba with CTDB, if samba configure/waf does not support setting of SOCKPATH, fallback to /tmp/ctdb.socket. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit a9511cf5ecd5bc39b0070f0afa8ac4d4926c6cab)	2012-10-22 09:01:27 +11:00
David Disseldorp	8cbf1a00c4	Build: Set the default ctdb socket path at configure time The ctdb socket path currently defaults to /tmp/ctdb.socket and can be modified at runtime using the --socket=filename option, common to both ctdb and ctdbd binaries. This change allows the default path to be set at configure time using the --with-socketpath=FILE argument. When not specified, the default path remains /tmp/ctdb.socket, documentation remains unchanged as a result. Signed-off-by: David Disseldorp <ddiss@samba.org> (This used to be ctdb commit f92b9c83a2f39fba9a141417a88de96fc8c592ff)	2012-10-21 01:39:08 +11:00
Amitay Isaacs	a00e50e503	ctdbd: Replace lockwait with locking API and remove ctdb_lockwait.c Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 2126795153dacb255e441abcb36ee05107b6282a)	2012-10-20 02:48:44 +11:00
Amitay Isaacs	83306337df	ctdbd: locking: Provide non-blocking API for locking of TDB record/db/alldb This introduces a consistent API for handling locks on single record, complete db or all dbs. The locks are taken out in a child process. In cases of timeout, find the processes that currently hold the lock and log. Callback functions for locking requests take locked boolean to indicate whether the lock was successfully obtained or not. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 1af99cf0de9919dd89af1feab6d1bd18b95d82ff)	2012-10-20 02:48:44 +11:00
Amitay Isaacs	1011d10a51	common: Add routines to get process and lock information Currently these functions are implemented only for Linux. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit be4051326b0c6a0fd301561af10fd15a0e90023b)	2012-10-20 02:48:44 +11:00
Amitay Isaacs	ef79dc012e	header: Added DB statistics update macros Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit a0cdfae7438092f5c605f0608daa536be860b7fe)	2012-10-20 02:48:44 +11:00
Martin Schwenke	8d7562f3f8	common: Debug ctdb_addr_to_str() using new function ctdb_external_trace() We've seen this function report "Unknown family, 0" and then CTDB disappeared without a trace. If we can reproduce it then this might help us to debug it. The idea is that you do something like the following in /etc/sysconfig/ctdb: export CTDB_EXTERNAL_TRACE="/etc/ctdb/config/gcore_trace.sh" When we hit this error then we call out to gcore to get a core file so we can do forensics. This might block CTDB for a few seconds. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 7895bc003f087ab2f3181df3c464386f59bfcc39)	2012-10-18 20:05:42 +11:00
Martin Schwenke	4b4e4d8870	ctdbd: Stop takeovers and releases from colliding in mid-air There's a race here where release and takeover events for an IP can run at the same time. For example, a "ctdb deleteip" and a takeover initiated by the recovery daemon. The timeline is as follows: 1. The release code registers a callback to update the VNN. The callback is executed after the eventscripts run the releaseip event. 2. The release code calls the eventscripts for the releaseip event, removing IP from its interface. The takeover code "updates" the VNN saying that IP is on some iface.... even if/though the address is already there. 3. The release callback runs, removing the iface associated with IP in the VNN. The takeover code calls the eventscripts for the takeip event, adding IP to an interface. As a result, CTDB doesn't think it should be hosting IP but IP is on an interface. The recovery daemon fixes this later... but it shouldn't happen. This patch can cause some additional noise in the logs: Release of IP 10.0.2.133/24 on interface eth2 node:2 recoverd:We are still serving a public address '10.0.2.133' that we should not be serving. Removing it. Release of IP 10.0.2.133/24 rejected update for this IP already in flight recoverd:client/ctdb_client.c:2455 ctdb_control for release_ip failed recoverd:Failed to release local ip address In this case the node has started releasing an IP when the recovery daemon notices the address is still hosted and initiates another release. This noise is harmless but annoying. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit bfe16cf69bf2eee93c0d831f76d88bba0c2b96c2)	2012-10-11 12:10:45 +11:00
Martin Schwenke	79ea15bf96	ctdbd: New tunable NoIPTakeoverOnDisabled Stops the behaviour where unhealthy nodes can host IPs when there are no healthy nodes. Set this to 1 when an immediate complete outage is preferred when all nodes are unhealthy. The alternative (i.e. default) can lead to undefined behaviour when the shared filesystem is unavailable. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a555940fb5c914b7581667a05153256ad7d17774)	2012-10-11 12:10:45 +11:00
Volker Lendecke	a68512c7d8	Correct include for ctdb_protocol.h With an old ctdb_protocol.h installed under /usr/local, ctdb will not compile because the <> form of include will find the header under /usr/local (This used to be ctdb commit c4f5a58471b206e2287c7958c7f29c1f1c0626ac)	2012-10-09 23:13:29 +11:00
Martin Schwenke	e05fc0e7b0	libctdb: add ctdb_getcapabilities() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 140fafef23050d40d66f5b5558c7efcb78f80cd2)	2012-09-28 17:05:34 +10:00
Ronnie Sahlberg	d21337a0fb	Add new command to find which interface is located on (This used to be ctdb commit f07376309e70f5ccdb7de8453caacc71b451ab48)	2012-06-20 15:11:49 +10:00
Ronnie Sahlberg	59565c05cf	STATISTICS: Add tracking of the 10 hottest keys per database measured in hopcount and add mechanisms to dump it using the ctdb dbstatistics command (This used to be ctdb commit 8307c70ed98996b430c470e9641a09fdeeb81bd8)	2012-06-13 16:19:18 +10:00
Amitay Isaacs	7631830152	server: Replace BOOL datatype with bool, True/False with true/false Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 6e5cbe8fff71985e5a2fc16b7e9f2b868011ff5d)	2012-05-28 11:22:25 +10:00
Ronnie Sahlberg	e7d21834ae	RECOVER: When we pull databases during recovery, we used to reallocate the databuffer for each entry added. This would normally not be an issue, but for cases where memory is fragmented, this could start to cost significant cpu if we need to reallocate and move to a different region. Change this to instead preallocate , by default, 10MByte chunks to the data buffer. This significantly reduces the number of potential reallocate and move operations that may be required. Create a tunable to override/change how much preallocation should be used. (This used to be ctdb commit 1f262deaad0818f159f9c68330f7fec121679023)	2012-05-25 12:34:06 +10:00
Ronnie Sahlberg	26322d257d	DEBUG: Add checks for and print debug messages when 1) a database contains very many records, 2) when a database is very big, 3) when a single record is very big. Add tunables to control when to log these instances and allow it to be completely turned off by setting the threshold to 0 (This used to be ctdb commit 9ed58fef4991725f75509433496f4d5ffae0ae87)	2012-05-21 13:26:13 +10:00
Ronnie Sahlberg	dce5969d12	Debug: When scripts hang, we may need to collect additional data in order to debug why the script hung. Break this debug and datacollection out into an external script to make it easier to modify what data we need to collect. For now we only collect a pstree so we can see what part of the script we hung in. S1037271 (This used to be ctdb commit 6e68797af67bee36f2bad045f94806e7e98f27e9)	2012-05-17 10:29:03 +10:00
Ronnie Sahlberg	a57eba2bb4	Track all child process so we never send a signal to an unrelated process (our child died and kernel wrapped the pid-space and reused the pid for a different process Wrap all creation of child processes inside ctdb_fork() which is used to track all processes we have spawned. Capture SIGCHLD to track also which child processes have terminated. Wrap kill() inside ctdb_kill() and make sure that we never send a !0 signal to a child process pid that has already terminated (and might have been replaced with a (This used to be ctdb commit f73a4b1495830bcdd094a93732a89dd53b3c2f78)	2012-05-03 14:03:26 +10:00
Ronnie Sahlberg	a367fa6138	RELOADIPS: simplify the reloadips code a bit and also update the "read public address file" to not check if the address exists already locally when we read if from the child process, to stop it from spamming the logs with "We already host ..." messages (This used to be ctdb commit 334ea830f1bf33419f4a1e78f23afd41a852d0f4)	2012-05-01 15:34:26 +10:00
Ronnie Sahlberg	7a1aa560e7	Add new control to reload the public ip address file on a node Also add a method to use the recovery master/daemon to reload the public ips on all nodes in the cluster. Reloading the public ips on all nodes in the cluster is only supported if all nodes in the cluster are available and healthy. (This used to be ctdb commit 05603e914f8c12618d7e06943c0f7df207f645b0)	2012-05-01 10:48:08 +10:00
Amitay Isaacs	131d35d67d	includes: Move special tevent defines from tevent.h to includes.h This allows to build against system tevent library. Also include tevent header along with other common headers. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 9ae4389c2c959c5dcd8395fdae2b25ed7e1e873a)	2012-04-13 17:28:14 +10:00
Martin Schwenke	fbe64dec01	Undo damage done by d8d37493478a26c5f1809a5f3df89ffd6e149281 The implementation of DisableIPFailover got intermingled with --nopublicipcheck. This just looks wrong - Ronnie must have been having a bad day. :-) Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5083b266dd68b292c4275505f3d1b878dbf12f11)	2012-03-22 15:34:52 +11:00
Ronnie Sahlberg	2456f77ca6	NoIPTakeover: change the tunable name for the "dont allow failing addresses over onto the node" to NoIPTakeover (This used to be ctdb commit 35592e618cfd827b6978af6332f80504f232c46a)	2012-03-22 11:05:15 +11:00
Ronnie Sahlberg	befa9df152	Make NoIPFailback a node local setting. Nodes that have NoIPFailback set to !0 can not takeover new ip addresses during failover. Remove the old global setting for this unused tunable and add it as a new node flag. This node flag is only valid/defined within the takeover subsystem in the recovery daemon. Add async functions to collect the NoIPFailback settings for each node. This will later be used to disqualify certain nodes from being takeover targets when we perform reallocation. (This used to be ctdb commit 668f3e88a9e5f598706952b7140547640c85a5ed)	2012-03-22 09:09:57 +11:00
Ronnie Sahlberg	fa3a06246a	STICKY: add prototype code to make records stick to a node to "calm" down if they are found to be very hot and accessed by a lot of clients. This can improve performance and stop clients from having to chase a rapidly migrating/bouncing record (This used to be ctdb commit d0d98f7e45e5084b81335b004d50bddc80cdc219)	2012-03-20 17:12:19 +11:00
Ronnie Sahlberg	e7e51ddb64	LACOUNT: Add back lacount mechanism to defer migrating a fetched/read copy until after default of 20 consecutive requests from the same node This can improve performance slightly on certain workloads where smbds frequently read from the same record (This used to be ctdb commit 035c0d981bde8c0eee8b3f24ba8e2dc817e5b504)	2012-03-20 12:26:22 +11:00
Ronnie Sahlberg	6a493a0b08	STATISTICS: add per-db hop count statistics (This used to be ctdb commit 1c976d83b1d7dac6f0ef81306774998e4c8b56a1)	2012-03-20 12:11:55 +11:00
Ronnie Sahlberg	c051f67d67	FETCH COLLAPSE : Change the fetch-lock collapse to collapse ALL fetches, including fetch-locks into a single command in flight per record. Also add a tunable to enable/disable this optimization for hot records (This used to be ctdb commit eafd7bbaaa5931546a96c8beae3cf9a39a49c925)	2012-03-20 11:39:00 +11:00
Ronnie Sahlberg	038c946e80	add max hop count buckets to see how bad hopcounts are (This used to be ctdb commit 7d3931298e6477d92f43652c3006b0c426cb1307)	2012-03-20 11:20:53 +11:00
Ronnie Sahlberg	f3600276fc	Add a tunable variable to control how long we defer after a ctdb addip until we force a rebalance and try to failback addresses onto this node Have it default to 300 seconds. (This used to be ctdb commit 49791db7dc74cffd7e88bd73091590cdc1909328)	2012-02-28 06:58:59 +11:00
Ronnie Sahlberg	ef2bd0b016	When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance. (This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2)	2012-02-28 06:56:04 +11:00
Ronnie Sahlberg	93ec9c589c	Eventscripts: remove the horrible horrible circular reference between state and callback since these two structures do not even share the same parent talloc context. Instead, tie them together via referencing a permanent linked list hung off the ctdb structure. (This used to be ctdb commit a95c02da6c67dc4bd8716b75318a4188301df6f9)	2012-02-23 06:49:47 +11:00
Ronnie Sahlberg	42e477b14e	READONLY: only send a control to schedule fast-vacuuming from child context iff we have a connection open to the main daemon there are some child processes where we do not create a connection to the main daemon (switch_from_server_to_client()) because it is expensive to set up and we normally might not need to talk to the daemon at all via a domainsocket. but we might want to still call to ctdb_ltdb_store() from such child processes. (This used to be ctdb commit 9e372a08c40087e6b5335aa298e94d88273566a5)	2012-02-21 07:03:44 +11:00
Ronnie Sahlberg	73f8be16c6	ReadOnly: add per-database statistics to view how much delegations/revokes we have (This used to be ctdb commit 751ed46197661eb841042ab6a02855a51dd0b17c)	2012-02-08 15:29:27 +11:00
Ronnie Sahlberg	1eafa68f0f	STATISTICS: add total counts for number of delegations and number of revokes Everytime we give a delegation to another node we count this as one delegation. If the same record is delegated to several nodes we count one for each node. Everytime a record has all its delegations revoked we count this as one revoke. (This used to be ctdb commit b098bcf8007be63889aaed640a951b0eeaa9d191)	2012-02-08 13:42:30 +11:00
Martin Schwenke	ed8a8ee966	libctdb - add ctdb_getvnnmap() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f6039eaece4224b866a98dd49010f278a7b3f015)	2012-02-06 16:00:23 +11:00
Ronnie Sahlberg	e648045499	Merge branch 'master' of ssh://git.samba.org/data/git/ctdb (This used to be ctdb commit 15d8ae8b0f80f95d7839528b8ac60aa0e2485c77)	2012-01-03 12:40:15 +11:00
Michael Adam	e04fad0ee4	vacuum: add new tunable VacuumInterval and mark Vacuum{Default,Min,Max}Interval obsolete And use VacuumInterval instead of VacuumDefaultInterval in the code. (This used to be ctdb commit 78530f40338f511a7cd1d33ada450905742bfa8f)	2011-12-23 17:39:02 +01:00
Michael Adam	a481ca711f	vacuum: add ctdb_local_remove_from_delete_queue() Pair-Programmed-With: Stefan Metzmacher <metze@samba.org> (This used to be ctdb commit a5065b42a98c709173503e02d217f97792878625)	2011-12-23 17:39:00 +01:00
Martin Schwenke	8b74037633	ctdb tool - generalise nodestring parsing for -n Centralise -n nodestring parsing and add the ability to pass a comma-separated list of node numbers. Listing a node that is disconnected or deleted results in failure, similar to the way passing a single node currently works. All of the auto_all commands inherit this functionality. For now, the non-auto_all commands do not inherit this - they need to be individually tweaked. Therefore, we haven't updated the documentation to advertise this feature. Implemented via a new function parse_nodestring() that parses an optional (pass NULL when not available to indicate "current node") comma-separated list of node numbers or "all". parse_nodestring() can be told to be non-fatal for disconnected/deleted nodes so it can also be used in other contexts (yes, coming soon). main() is changed to call this function. A new magic PNN value CTDB_MULTICAST is added and along with a corresponding option.nodes structure member (a talloc-ed array of PNNs). This is also populated for "all" as well. control_status() has new function pretty_print_flags() factored out so pretty-printed flags can be used in error/debug messages. New function is_partially_online() is also factored out - this simplifies some of the logic. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 920e3a732eb9e09004edde6cfb3c7db8a004016f)	2011-12-08 17:00:17 +11:00
Ronnie Sahlberg	609149bdc8	LibCTDB: Add support for the 'get interfaces' control and update the ctdb tool to use this interface (This used to be ctdb commit 77dc0c7351071243d9096d3607d7499c82f46ec0)	2011-12-06 13:12:18 +11:00
Mathieu Parent	bb3d6698e9	Move platform-specific code to common/system_* This removes #ifdef AIX and ease the addition of new platforms. (This used to be ctdb commit 2fd1067a075fe0e4b2a36d4ea18af139d03f17bf)	2011-12-06 11:57:11 +11:00
Michael Adam	ad0de5494e	traverse: fix traversing with empty records by adding a new (internal) control CTDB_CONTROL_TRAVERSE_START_EXT By this, the original CTDB_CONTROL_TRAVERSE_START control that is used by e.g. samba's smbstatus, is not changed, so that samba continues working without code change. The CTDB_CONTROL_TRAVERSE_START currently just adds the "withemptyrecords" flag to the state and processon on as CTDB_CONTROL_TRAVERSE_START_EXT. (This used to be ctdb commit 8281bb210858ed04992eacea7f6d02261e0fc1b1)	2011-12-03 02:15:30 +01:00
Ronnie Sahlberg	11f3c947e6	LibCTDB: add support for the check-srvids control (This used to be ctdb commit c32604fd0016de0df14845a2f222edaa3c52a4fa)	2011-11-30 10:00:07 +11:00
Volker Lendecke	5a1da0ac55	Add CTDB_CONTROL_CHECK_SRVID (This used to be ctdb commit ad64ef2c40a2a12b37dbf39142e95c6781c2fc3b)	2011-11-30 09:02:26 +11:00
Ronnie Sahlberg	0420449a6c	Recover Persistent database DB by DB and not record by record Add a new tunable that changes the mode how persistent databases are recovered. RecoveryPDBBySeqNum When set to 1, persistent databases will be recovered in whole from the node which has the highest "__db_sequence_number__" record. This record is managed by samba for those databases where we do persistent writes and have inter-record relations. For these databases we do not want the usual "blend records from all nodes based on individual record RSN" but instead a mode where we pick one instance of the persistent database. If no node was found with a "__db_sequence_number__" record at all, we fail back to the original "recover records independently based on record RSN". Some persistent databases do not contain record interrelations and as such does not contain this special record at all. (This used to be ctdb commit 502150c764298a9fa8c4d8aa445bf7d85d4ee9dc)	2011-11-30 08:48:23 +11:00
Ronnie Sahlberg	3cbff2edd8	LibCTDB: add get persistent db seqnum control (This used to be ctdb commit 6e96a62494bbb2c7b0682ebf0c2115dd2f44f7af)	2011-11-30 08:48:14 +11:00
Michael Adam	31d62794fe	ctdb: add an option --print-recordflags to trigger printing record flags in catdb and dumpdbbackup This changes the default behaviour to not print record flags. (This used to be ctdb commit 2d2ce07c51055d9400b22cd3c1fd682597cb921c)	2011-11-29 13:43:35 +01:00
Michael Adam	e6923904e8	ctdb: add an option --print-hash to enable printing of record hashes when dumping dbs (This used to be ctdb commit efc033c28ade97f9884794256d59a4553e052d5f)	2011-11-29 13:43:34 +01:00
Michael Adam	86cd78efee	ctdb: add an option --print-lmaster to enable printing of lmaster in "ctdb catdb" (This used to be ctdb commit 326f88ef622620cb9e0569c4497bc0e86124beaa)	2011-11-29 13:43:33 +01:00
Michael Adam	dc98c12ac9	ctdb: add an option --print-datasize to only print datasize instead of dumping data in db dumps Used in catdb, cattdb and dumpdbbackup. (This used to be ctdb commit dd866116041e71cbf91e7fd91edcc9501634051d)	2011-11-29 13:43:32 +01:00
Michael Adam	1fcc7651f4	ctdb: add an option --print-emptyrecords to enable printing of empty records in dumping databases this option is used with the commands catdb, cattdb and dumpdbbackup. (This used to be ctdb commit 6ec68a2e667f66d2b194fe48cb75229a2777842e)	2011-11-29 10:30:24 +01:00
Michael Adam	1a31c84348	traverse: add a flag to enable transferring empty records in cluster wide traverse This will be useful for also printing information about empty/deleted records in "ctdb catdb", e.g. for debugging vacuuming issues. (This used to be ctdb commit ddc5da3a0df7701934404192a0a0aa659a806acb)	2011-11-29 10:30:24 +01:00
Martin Schwenke	3ae8273d86	Make some ctdb_takeover.c functions static These were intentionally not static so they could be linked to in unit test programs. However, using the CCAN-style unit tests where relevant code is just included, this is no longer necessary. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d0e9e8554614bd49ffb9ec3509feaa0e80d0f65d)	2011-11-11 14:41:47 +11:00
Martin Schwenke	f186dd90b6	Move some common functions to common/ctdb_ltdb.c Move identical copies of ctdb_null_func(), ctdb_fetch_func(), ctdb_fetch_with_header_func() from ctdb_client.c and ctdb_ltdb_server.c to somewhere common. This is in the context of wanting to run CCAN-style tests where most of the ctdbd code is just included in the test program. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 126cb0d369b2b1aed63801dc4ba0554399e8b7e4)	2011-11-11 14:31:50 +11:00
Martin Schwenke	52ff485958	Added some #ifndefs to stop files being included multiple times. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit fdca12c25e6fce6206135b994dedf44265e4eb09)	2011-11-11 14:31:50 +11:00
Ronnie Sahlberg	44de394796	SRVID ranges: Change the ranges for SRVIDs to allow 8 bit prefixes Update the ranges used for SRVID allocation to allow 8 bit prefixes and thus 56 user-defined bits. Define the defacto-use of the 0x00 prefix as a SRVID used to register a process id Upgrade SAMBA/iSCSI/NFS/TEST from a 32 bit prefix each to an 8 bit prefix each for private use. (This used to be ctdb commit 5de9ec2bdf8067406165bc470becdca87f458ae9)	2011-11-09 08:12:44 +11:00
Ronnie Sahlberg	0e79b2d1e8	Record Fetch Collapse: Collapse multiple fetch request into one single request. When multiple clients fetch the same record concurrently, send only one single fetch across the network and defer all other fetches locally. This improves performance for hot records and reduces cpu load on ctdb. (This used to be ctdb commit 82d6946ad8b3348e8b9d3d971f24925ade02d1be)	2011-11-08 16:08:28 +11:00
Ronnie Sahlberg	c21ec9fffc	ReadOnly: add readonly record lock requests to libctdb Initial readonly record support in libctdb. New records are not yet created by the library but existing records will be delegated as readonly records. This needs a bit more tests before we can drop the "old style" implementation of client code in client/ctdb_client.c (This used to be ctdb commit fb50a45a21ff56480d76acd1c33c13c323cbf5e2)	2011-10-28 11:55:46 +11:00
Ronnie Sahlberg	8e4bfba75c	ReadOnly: Rename the function ctdb_ltdb_fetch_readonly() to ctdb_ltdb_fetch_with_header() since this is what it actually does. (This used to be ctdb commit 94a5ce4e08e7891f07dbfe4c822ca4be5ab10965)	2011-09-13 18:38:20 +10:00
Ronnie Sahlberg	0dc5584101	Merge branch 'master-readonly-records' into foo Conflicts: Makefile.in tools/ctdb.c (This used to be ctdb commit 0fedef0ffba4178126eee9544c5e2db52f5db893)	2011-09-12 09:34:34 +10:00

865 Commits