1
0
mirror of https://github.com/samba-team/samba.git synced 2024-12-24 21:34:56 +03:00
Commit Graph

558 Commits

Author SHA1 Message Date
Ronnie Sahlberg
93ec9c589c Eventscripts: remove the horrible horrible circular reference between state and callback since these two structures do not even share the same parent talloc context.
Instead, tie them together via referencing a permanent linked list hung off the ctdb structure.

(This used to be ctdb commit a95c02da6c67dc4bd8716b75318a4188301df6f9)
2012-02-23 06:49:47 +11:00
Ronnie Sahlberg
42e477b14e READONLY: only send a control to schedule fast-vacuuming from child context iff we have a connection open to the main daemon
there are some child processes where we do not create a connection to the main daemon (switch_from_server_to_client()) because it is expensive to set up and we normally might not need to talk to the daemon at all via a domainsocket.
but we might want to still call to ctdb_ltdb_store() from such chil processes.

(This used to be ctdb commit 9e372a08c40087e6b5335aa298e94d88273566a5)
2012-02-21 07:03:44 +11:00
Ronnie Sahlberg
73f8be16c6 ReadOnly: add per-database statistics to view how much delegations/revokes we have
(This used to be ctdb commit 751ed46197661eb841042ab6a02855a51dd0b17c)
2012-02-08 15:29:27 +11:00
Martin Schwenke
ed8a8ee966 libctdb - add ctdb_getvnnmap()
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit f6039eaece4224b866a98dd49010f278a7b3f015)
2012-02-06 16:00:23 +11:00
Michael Adam
e04fad0ee4 vacuum: add new tunable VacuumInterval and mark Vacuum{Default,Min,Max}Interval obsolete
And use VacuumInterval instead of VacuumDefaultInterval in the code.

(This used to be ctdb commit 78530f40338f511a7cd1d33ada450905742bfa8f)
2011-12-23 17:39:02 +01:00
Michael Adam
a481ca711f vacuum: add ctdb_local_remove_from_delete_queue()
Pair-Programmed-With: Stefan Metzmacher <metze@samba.org>

(This used to be ctdb commit a5065b42a98c709173503e02d217f97792878625)
2011-12-23 17:39:00 +01:00
Ronnie Sahlberg
609149bdc8 LibCTDB: Add support for the 'get interfaces' control and update the ctdb tool to use this interface
(This used to be ctdb commit 77dc0c7351071243d9096d3607d7499c82f46ec0)
2011-12-06 13:12:18 +11:00
Mathieu Parent
bb3d6698e9 Move platform-specific code to common/system_*
This removes #ifdef AIX and ease the addition of new platforms.

(This used to be ctdb commit 2fd1067a075fe0e4b2a36d4ea18af139d03f17bf)
2011-12-06 11:57:11 +11:00
Michael Adam
ad0de5494e traverse: fix traversing with empty records by adding a new (internal) control CTDB_CONTROL_TRAVERSE_START_EXT
By this, the original CTDB_CONTROL_TRAVERSE_START control that is
used by e.g. samba's smbstatus, is not changed, so that samba
continues working without code change.

The  CTDB_CONTROL_TRAVERSE_START currently just adds the "withemptyrecords"
flag to the state and processon on as CTDB_CONTROL_TRAVERSE_START_EXT.

(This used to be ctdb commit 8281bb210858ed04992eacea7f6d02261e0fc1b1)
2011-12-03 02:15:30 +01:00
Volker Lendecke
5a1da0ac55 Add CTDB_CONTROL_CHECK_SRVID
(This used to be ctdb commit ad64ef2c40a2a12b37dbf39142e95c6781c2fc3b)
2011-11-30 09:02:26 +11:00
Ronnie Sahlberg
0420449a6c Recover Persistent database DB by DB and not record by record
Add a new tunable that changes the mode how persistent databases are recovered.
RecoveryPDBBySeqNum

When set to 1, persistent databases will be recovered in whole from the node which
has the highest "__db_sequence_number__" record.
This record is managed by samba for those databases where we do persistent writes and have
inter-record relations.
For these databases we do not want the usual "blend records from all nodes based
on individual record RSN" but instead a mode where we pick one instance of the persistent database.

If no node was found with a "__db_sequence_number__" record at all, we fail back to the original "recover records independently based on record RSN".
Some persistent databases do not contain record interrelations and as such does not
contain this special record at all.

(This used to be ctdb commit 502150c764298a9fa8c4d8aa445bf7d85d4ee9dc)
2011-11-30 08:48:23 +11:00
Martin Schwenke
3ae8273d86 Make some ctdb_takeover.c functions static
These were intentionally not static so they could be linked to in unit
test programs.  However, using the CCAN-style unit tests where
relevant code is just included, this is no longer necessary.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d0e9e8554614bd49ffb9ec3509feaa0e80d0f65d)
2011-11-11 14:41:47 +11:00
Martin Schwenke
f186dd90b6 Move some common functions to common/ctdb_ltdb.c
Move identical copies of ctdb_null_func(), ctdb_fetch_func(),
ctdb_fetch_with_header_func() from ctdb_client.c and
ctdb_ltdb_server.c to somewhere common.

This is in the context of wanting to run CCAN-style tests where most
of the ctdbd code is just included in the test program.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 126cb0d369b2b1aed63801dc4ba0554399e8b7e4)
2011-11-11 14:31:50 +11:00
Ronnie Sahlberg
0e79b2d1e8 Record Fetch Collapse: Collapse multiple fetch request into one single request.
When multiple clients fetch the same record concurrently, send only one single
fetch across the network and deferr all other fetches locally.
This improves performance for hot records and reduces cpu load on ctdb.

(This used to be ctdb commit 82d6946ad8b3348e8b9d3d971f24925ade02d1be)
2011-11-08 16:08:28 +11:00
Ronnie Sahlberg
8e4bfba75c ReadOnly: Rename the function ctdb_ltdb_fetch_readonly() to ctdb_ltdb_fetch_with_header() since this is what it actually does.
(This used to be ctdb commit 94a5ce4e08e7891f07dbfe4c822ca4be5ab10965)
2011-09-13 18:38:20 +10:00
Ronnie Sahlberg
0dc5584101 Merge branch 'master-readonly-records' into foo
Conflicts:

	Makefile.in
	tools/ctdb.c

(This used to be ctdb commit 0fedef0ffba4178126eee9544c5e2db52f5db893)
2011-09-12 09:34:34 +10:00
Ronnie Sahlberg
1c05db2c9c Merge remote branch 'ddiss/master_pmda_and_client_timeouts'
(This used to be ctdb commit 7bebfc7bad8f36e54003b8e25372fdaf54836e21)
2011-09-08 11:22:53 +10:00
David Disseldorp
2f925f1e64 pmda: Attempt reconnects while ctdbd is unavailable
Attempt to reconnect to ctdbd on fetch while it is unreachable.

We must provide our own queue callback wrapper, as ctdb_client_read_cb()
exits on transport failure.

(This used to be ctdb commit 28df6fbf1273b8d095a2bc38dca6a6c35c5c31bd)
2011-09-06 14:01:18 +02:00
Ronnie Sahlberg
783ceca07b Interface monitoring: add a event to trigger every 30 seconds to check that all interfaces referenced by the public address list actually exists.
This will make it much easier to root-cause problems such as
S1029023
when an external application deleted the interface while it is still is in use by ctdbd.

(This used to be ctdb commit 9abf9c919a7e6789695490e2c3de56c21b63fa57)
2011-09-06 17:02:19 +10:00
Ronnie Sahlberg
64378fea58 Check interfaces: when reading the public addresses file to create the vnn list
check that the actual interface exist, print error and fail startup if the interface does not exist.

(This used to be ctdb commit cd33bbe6454b7b0316bdfffbd06c67b29779e873)
2011-09-06 16:11:00 +10:00
Michael Adam
a3e0079568 Add a tunable "AllowClientDBAttach" with default value 1.
When set to 0, clients will not be able to attach to databases
via the db_attach control. This might can be useful for maintenance
where ctdb should be kept running but clients should not be able
to modify databases.

(This used to be ctdb commit ddfeecda87955b4e46777599f678e6926d37f4c4)
2011-09-05 16:17:39 +10:00
Ronnie Sahlberg
206a3c0c66 ReadOnly: add a new control to activate readonly lock capability for a database.
let all databases default to not support this  until enabled through this control

(This used to be ctdb commit 908a07c42e5135a3ba30a625fc4f4e4916de197a)
2011-09-01 11:08:18 +10:00
Ronnie Sahlberg
2902203900 Logging: when we log stdout/stderr messages from eventscripts to the system log, prefix every line of output with the name of the eventscript.
CQ S1028412

(This used to be ctdb commit 392363c04185f47a826fc6ed95038342be2150bf)
2011-08-26 09:39:25 +10:00
Ronnie Sahlberg
37608d70fc ReadOnly: Add clientside code to fetch readonly records
(This used to be ctdb commit 6fccc902bce21fa6ff13ed08ee3341bbf8be39f2)
2011-08-23 10:34:15 +10:00
Ronnie Sahlberg
1bbd4cbf35 ReadOnly: Add a ctdb_ltdb_fetch_readonly() helper function
(This used to be ctdb commit 8551420fb331dd2a897f4619278a981fcefb96e8)
2011-08-23 10:33:17 +10:00
Ronnie Sahlberg
dda2616cf5 ReadOnly: Add a function to start a revoke of all delegations for a record.
This triggers a child process to be created to perform the actual potentially blocking calls that are required.

(This used to be ctdb commit 7d575ee92c95bc4aab78a33bc1aac7ff0811ab3a)
2011-08-23 10:27:31 +10:00
Ronnie Sahlberg
1bb855bd52 ReadOnly: Add functions to register CALLs to a context used to handle deferal of processing of CALL commands.
Once the contexts are freed, the deferred calls are re-issued to the input packet processing functions again.
This is needed when/if a CALL can not currently be processed by the main engine due to the record being locked down for revoking of all delegations.

The data is passed through several layers of callbacks, and finally a timed event callback to ensure that the processing of the packet will be restarted again at the topmost eventloop, avoinding event loop nesting.

(This used to be ctdb commit cc6f78efcfa3b8caeffbd68018e6dfbf81488dce)
2011-08-23 10:25:57 +10:00
Ronnie Sahlberg
3d495c48d2 ReadOnly: Add an extra flag to ctdb_call_local to specify whether we want to write the record and header back to the tdb (for example we do when performing dmaster migrations)
(This used to be ctdb commit b935e83255aeb3754b2fd37cf5611e02f7283514)
2011-08-23 10:25:05 +10:00
Ronnie Sahlberg
1441b77cce ReadOnly: Add "readonly" flag to the ctdb_db_context to indicate if this database supports readonly operations or not. Add a private lock-less tdb file to the ctdb_db_context to use for tracking delegarions for records
Assume all databases will support readonly mode for now and se thte flag for all databases. At later stage we will add support to control on a per database level whether delegations will be supported or not.

(This used to be ctdb commit 502f86f79944df4bac9094f716e54110c511dc24)
2011-08-23 10:24:26 +10:00
Ronnie Sahlberg
f924b3f40e ReadOnly: Add helper functions to manipulate a TDB_DATA as a bitmap for nodes that we are tracking as having a readonly delegation
(This used to be ctdb commit d10084e62d37674bb8d9e31d457fd23e050545be)
2011-08-23 10:09:42 +10:00
Martin Schwenke
5ac67504ca Tests: Initial test code for LCP2 IP allocation algorithm.
Move struct ctdb_public_ip_list to ctdb_private.h and put some
definitions for some functions from ctdb_takeover.c there.  This
allows those functions to be called from unit tests.

Add ctdb_takeover_tests.c and the Makefile support to build it.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 9d34be0233edf3bc022345c0494c4b2a4d7f8480)
2011-07-29 09:01:36 +10:00
Martin Schwenke
ff1a81c872 IP allocation - add LCP2 algorithm.
The current non-deterministic IP allocation algorithm balances IPs
across the whole cluster.  It does not consider different
interfaces/VLANs/subnets, so these different groups of IPs aren't
generally well balanced.

This adds the LCP2 algorithm for IP allocation and allows it to be
enabled by setting the "LCP2PublicIPs" tunable to 1.

The LCP2 algorithm calculates the imbalance of a node by totalling the
squares of the distances between each IP on the node.  The IP distance
is defined as the length longest common prefix (LCP) of bits that is
found when comparing 2 IPs.  The imbalance of a cluster is the maximum
imbalance for any node.  At each step the algorithm selects an
allocation to the IP/node combination that results in the choosing the
allocation that best reduces the imbalance of the cluster.

The implementation splits out the IP allocation part of
ctdb_takeover_run() into new function ctdb_takeover_run_core(), and
then extracts out the basic IP assignment code into new functions
basic_allocate_unassigned() and basic_failback().  3 new functions
lcp2_init(), lcp2_allocate_unassigned() and lcp2_failback() implement
the LCP2 algorithm, and are hooked into ctdb_takeover_run_core().

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 61fc7fbd0235469df22deb6581c6bd47e30bc0be)
2011-07-29 09:01:17 +10:00
Michael Adam
9e8d6b82b5 server: Use the ctdb_ltdb_store_server() in the ctdb daemon for non-persistent dbs
This is realized by adding a ctdb_ltdb_store_fn function pointer to the db
context and filling it in the attach procedure for non-persistent dbs.

(This used to be ctdb commit df49ec44de80affa5ccc637dec12a20a26e8706e)
2011-03-14 13:35:50 +01:00
Michael Adam
a6b13b21c1 client: add accessor function ctdb_header_from_record_handle().
(This used to be ctdb commit cf57efd440ccc3db381386f4749bfcbf8ac5ecae)
2011-03-14 13:35:50 +01:00
Michael Adam
50bd249990 vacuum: add ctdb_local_schedule_for_deletion()
(This used to be ctdb commit b70bc141d84f7355d2c6c901961b7366db566980)
2011-03-14 13:35:49 +01:00
Michael Adam
8569fcbc83 server: implement a new control SCHEDULE_FOR_DELETION to fill the delete_queue.
(This used to be ctdb commit 680223074e992b32ccf6f42cb80c3fa93074fee7)
2011-03-14 13:35:49 +01:00
Michael Adam
77d4d156d3 control: add macro CHECK_CONTROL_MIN_DATA_SIZE.
This is for the control dispatcher to check whether the input data has
a required minimum size.

(This used to be ctdb commit 2038e745db33cc5c3b4e2db8a00a57ede03906a2)
2011-03-14 13:35:49 +01:00
Michael Adam
9d20f76052 Add a tunable VacuumFastPathCount.
This will control how many fast-path vacuuming runs wil have to
be done, before a full vacuuming will be triggered, i.e. one with
a db-traversal.

(This used to be ctdb commit 0d997ec7e61a7bee2cb05456f9c7d5e6f7a44797)
2011-03-14 13:35:47 +01:00
Michael Adam
cd061f3dee Add a delete_queue to the ctdb database context struct.
This list will be filled by the client using a new
delete control. The list will then be used to implement
a fast-path vacuuming that will traverse this list instead
of traversing the database.

(This used to be ctdb commit 9bbedf786b26bb074f668b31f29a9032af958673)
2011-03-14 13:35:45 +01:00
Ronnie Sahlberg
8acb677c9c Deferred attach : at early startup, defer any db attach calls until we are out of recovery.
(This used to be ctdb commit eeaabd579841f60ab2c5b004cbbb1f5de2bfe685)
2011-03-01 12:13:34 +11:00
Michael Adam
2bd04f0ff8 persistent: add ctdb_persistent_finish_trans3_commits().
This function walks all databases and checks for running trans3 commits.
It sends replies to all of them (with error code) and ends them.
To be called when a recovery finishes.

(This used to be ctdb commit 70ba153b532528bdccea70c5ea28972257f384c1)
2011-02-24 10:35:26 +01:00
Michael Adam
ace1efb878 persistent: add a ctdb_persistent_state member to the ctdb_db context.
To be used for tracking running transaction commits through recoveries.

(This used to be ctdb commit 1237e15df4af58a3d220eea42a4b75e21e65029f)
2011-02-24 10:35:25 +01:00
Ronnie Sahlberg
b57bd0f896 Remove LACOUNT and LACCESSOR and migrate the records immediately.
This concept didnt work out and it is really just as expensive as a full migration
anyway, without the benefit of caching the data for subsequence accesses.

Now, migrate the records immediately on first access.
This will be combined with a "cheap vacuum-lite" for special empty records to
prevent growth of databases.

Later extensions to mimic read-only behaviour of records will include proper shared read-only locking of database records, making the laccessor/lacount read-only access to the data obsolete anyway.

By removing this special case and handling of lacount laccessor makes the codapath where shared read-only locking will be be implemented simpler, and frees up space in the ctdb_ltdb header for use by vacuuming flags as well as read-only locking flags.

(This used to be ctdb commit 155dd1f4885fe142c6f8bd09430f65daf8a17e51)
2011-02-18 10:08:32 +11:00
Ronnie Sahlberg
0f33605866 LockWait congestion.
Add a dlist to track all active lockwait child processes.
Everytime creating a new lockwait handle, check if there is already an
active lockwait process for this database/key and if so,
send the new request straight to the overflow queue.

This means we will only have one active lockwaic child process for a certain key,
even if there were thousands of fetch-lock requests for this key.

When the lockwait processing finishes for the original request, the processing in d_overflow() will automagically process all remaining keys as well.

Add back a --nosetsched argument to make it easier to run under gdb

(This used to be ctdb commit 3e9317a2e1f687b04bf51575d47fcd4faa6e6515)
2011-01-24 12:21:58 +11:00
Rusty Russell
e57362ecf4 ctdb_lockwait: create overflow queue.
Once we have more than 200 children waiting on a particular db, don't create
any more.  Just put them on an overflow queue, and when a child gets a lock
search that queue to see if others were after the same lock (they probably
were).

(This used to be ctdb commit 5e614e8cfd1e9a4b13035a0e400b7a60a745b510)
2011-01-24 12:21:50 +11:00
Ronnie Sahlberg
fcd98a7e59 LIBCTDB: add support for traverse
(This used to be ctdb commit 9463e04038ba36792583f83bd95c1af322dc283a)
2011-01-14 17:38:56 +11:00
Ronnie Sahlberg
c4006ce844 Add ctdb_fork(0 which will fork a child process and drop the real-time
scheduler for the child.

Use ctdb_fork() from callers where we dont want the child to be running
at real-time privilege.

(This used to be ctdb commit 58795a4c9e0624e20fa3e0023b65127053edd103)
2011-01-11 07:40:41 +11:00
Ronnie Sahlberg
ea0df6d882 Revert scheduling back to use real-time processes
Revert this patch:
commit 482c302d46e2162d0cf552f8456bc49573ae729d

We may need to use real-time processes for the main daemon and the recovery daemon to handle the cases where systems come under very high loads.

(This used to be ctdb commit 08bef9dcab6e4da15fc783f8624e5ed09aa060b5)
2011-01-11 07:40:35 +11:00
Ronnie Sahlberg
c69ada0090 add a new ctdb_ltdb function to delete a record in a normal database
(This used to be ctdb commit fe9070ec9be69e6a6fcbf9899e7ced24541c9c3a)
2010-12-07 15:32:30 +11:00
Ronnie Sahlberg
5f76f3c0e2 Add a new tunable : DisableIPFailover that when set to non 0
will stopp any ip reallocations at all from happening.

(This used to be ctdb commit d8d37493478a26c5f1809a5f3df89ffd6e149281)
2010-11-10 14:55:24 +11:00
Ronnie Sahlberg
5ef29f9f25 Update latency countes to show min/max and average
(This used to be ctdb commit 1919e949af4641ffe919123e44b02fb87c13ab9f)
2010-10-11 15:12:24 +11:00
Ronnie Sahlberg
3ba7ac13eb Create a tunable for how often to collect rolling statistics and initialize it to 1 second
(This used to be ctdb commit cb8c779bb5d9862abbe08919aa181a1a1b2bef18)
2010-09-30 15:00:57 +10:00
Ronnie Sahlberg
9f66a93f12 Add rolling statistics that are collected across 10 second intervals.
Add a new command "ctdb stats [num]" that prints the [num] most recent statistics intervals collected.

(This used to be ctdb commit e6e16fcd5a45ebd3739a8160c8fb5f44494edb9e)
2010-09-29 12:14:45 +10:00
Ronnie Sahlberg
41b6e09fb1 Add a new statistics structure to keep the current running statistics
(This used to be ctdb commit 09e5a2fb47c312f71f455cdbf8d9cabcca1041a4)
2010-09-29 12:14:35 +10:00
Ronnie Sahlberg
39c367a68f Create macros to update the statistics counters and use these macros
everywhere instead of manipulating the coutenrs directly.

(This used to be ctdb commit 2e648df890e5713bc575965d87937827b068d0d7)
2010-09-29 12:14:24 +10:00
Ronnie Sahlberg
c6e20a06c7 set up a handler to catch and log debug messages from the tevent layer
(This used to be ctdb commit fdb4c02f595fa207310a9a48da3fefd653fa9e4b)
2010-09-28 08:30:26 +10:00
Ronnie Sahlberg
22ea35f17d adda GETPUBLICIPS control to libctdb and use this in the test example
enhance the test example to show the new releaseip/takeip messages

(This used to be ctdb commit 21cc57883e6c02b0e037211b26d1d866d5d7f03d)
2010-09-15 14:58:11 +10:00
Stefan Metzmacher
0b5bd411ca server/banning: also release all ips if we're banning ourself
metze

(This used to be ctdb commit c386f2c62f06f1c60047b7d4b1ec7a9eec11873c)
2010-09-14 15:50:31 +10:00
Ronnie Sahlberg
a2c874bd61 Implement a new function GETNODEMAP in libctdb.
This function returns a pointer to a nodemap structure.

The returned structure must later be freed by calling ctdb_free_nodemap().

Move the definition of ctdb_sock_addr from ctdb_client.h to ctdb_protocol.h

Move the definition of the node flags, ctdb_node_and_flags and ctdb_node_map from ctdb_private.h to ctdb_protocol.h

Add both sync and async example for ctdb_getnodemap to the test application libctdb/tst.c

(This used to be ctdb commit 31c10eb2b337fd7d8a97a1f9e69b0e7570fec71d)
2010-09-13 14:32:11 +10:00
Ronnie Sahlberg
4c05f1900c Merge commit 'rusty/vacuum-fix-master'
(This used to be ctdb commit dc301b324d2c14a2425a965c076113c4fe97903e)
2010-08-19 13:16:35 +10:00
Ronnie Sahlberg
5aa5f3e7bf Remove the structure ctdb_control_tcp_vnn since this is identical to the structure ctdb_tcp_connection.
Add a new "ctdb deltickle" command to delete tickles from the database.
This can ONLY be used for tickles created by "ctdb addtickle".

Push any "addtickle/deltickle" updates to other nodes every TickleUpdateInterval seconds'

(This used to be ctdb commit acded034e2f0dcae4c2c9e54e16a001caf23caec)
2010-08-18 12:36:03 +10:00
Rusty Russell
af55c910a4 freeze: abort vacuuming when we're going to freeze.
There are some reports of freeze timeouts, and it looks like vacuuming might
be the culprit.  So we add code to tell them to abort when a freeze is
going on.

(This is based on the 1.0.112 branch version 517f05e42f, but far
 simpler since tdb is now robust against processes being killed during
 transaction commit)

CQ:S1018154 & S1018349
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit f5d7dc679501e607c2c83a248a89d3cada9df146)
2010-08-18 10:54:28 +09:30
Rusty Russell
f93440c4b7 event: Update events to latest Samba version 0.9.8
In Samba this is now called "tevent", and while we use the backwards
compatibility wrappers they don't offer EVENT_FD_AUTOCLOSE: that is now
a separate tevent_fd_set_auto_close() function.

This is based on Samba version 7f29f817fa.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 85e5e760cc91eb3157d3a88996ce474491646726)
2010-08-18 09:16:31 +09:30
Rusty Russell
7061ceffd8 Report client for queue errors.
We've been seeing "Invalid packet of length 0" errors, but we don't know
what is sending them.  Add a name for each queue, and print nread.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit e6cf0e8f14f4263fbd8b995418909199924827e9)
2010-07-01 23:08:49 +10:00
Rusty Russell
8946028a07 speed startup: add --sloppy-start.
The extra recovery interval wait was introduced in 821333afb458 but no
explanation was provided in that message.  Nonetheless, if starting
the entire cluster for the first time, it should be safe to skip this.

We use the commandline arg --sloppy-start which should discourage
people from using it outside testing.

Seconds between ctdbd first log message and node healthy:
BEFORE:	16.10
AFTER: 4.03

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 509e2e89ae233a0e91998d95267bf62f296a73cd)
2010-06-22 22:52:34 +09:30
Rusty Russell
5f9e4b60ae Delay reusing ids to make protocol more robust
Ronnie and I tracked down a bug which seems to be caused by a node
running so slowly that we timed out the request and reused the request
id before it responded.

The result was that we unlocked the wrong record, leading to the
following:

	ctdbd: tdb_unlock: count is 0
	ctdbd: tdb_chainunlock failed
	smbd[1630912]: [2010/06/08 15:32:28.251716,  0] lib/util_sock.c:1491(get_peer_addr_internal)
	ctdbd: Could not find idr:43
	ctdbd: server/ctdb_call.c:492 reqid 43 not found

This exact problem is now detected, but in general we want to delay
id reuse as long as possible to make our system more robust.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 9eb9c53ef29f4871ae2fe62fc5cb6145fca89eed)
2010-06-10 08:58:55 +09:30
Ronnie Sahlberg
53ea238c6c Add a variable for start/current time to ctdb statistics
and print the time startistics was taken and for how long the statistics have been collected to the "ctdb statistics" output.

(This used to be ctdb commit 1bdfe0cd3370a335b960ce1ef97eade93b0cd2fa)
2010-06-02 13:14:53 +10:00
Ronnie Sahlberg
f1b8bd94bb rename ctdb_message_fn_t to ctdb_msg_fn_t to avoid a conflict with the type of the same name used in libctdb
(This used to be ctdb commit 49e23f8329649e4d9eefab47c9b158fcc7210d07)
2010-06-02 10:00:58 +10:00
Rusty Russell
d5f6026a22 libctdb: reorganize headers: remove ctdb.h, add ctdb_client.h and ctdb_protocol.h
ctdb_client.h is the existing internal client interface (which was mainly
in ctdb.h), and ctdb_protocol.h is the information needed for the wire
protocol only.

ctdb.h will be the new, shiny, libctdb API.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 4bba6b8cd47b352f98d41f9f06258d5ac3c9adef)
2010-05-20 15:18:30 +09:30
Ronnie Sahlberg
6f1221e9e1 Add the number of performed recoveries to the "ctdb statistics" output.
(This used to be ctdb commit fa045733cb81412f0d02ab52d74eabc7efca8b3d)
2010-05-11 09:44:53 +10:00
Rusty Russell
72c275dd70 ctdb: use full range of IDR
This resolves a problem with huge numbers of requests which could overflow
16 bits.  Fortunately, the IDR should scale reasonably well, so we can simply
hold all the requests.

Although noone checks for failure, I added a constant for that.

BZ: 60540
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 72efc4122e37798227c3420a65ed1f706ca9ebe7)
2010-05-11 09:44:43 +10:00
Ronnie sahlberg
46f00a2478 Merge commit 'rusty/signal-fix'
(This used to be ctdb commit 221a9bb41c3a7af0cc65cda78365010893ca1430)
2010-05-03 15:57:41 +10:00
Ronnie Sahlberg
4a43428440 The recent change to the recovery daemon to keep track of and
verify that all nodes agree on the most recent ip address assignments
broke "ctdb moveip ..." since that call would never trigger
a full takeover run and thus would immediately trigger an inconsistency.

Add a new message to the recovery daemon where we can tell the recovery daemon to update its assignments.

BZ62782

(This used to be ctdb commit e7069082e5f0380dcddee247db8754218ce18cab)
2010-05-03 15:47:17 +10:00
Rusty Russell
e1b59b6a47 eventscript: don't do debugging system() from inside signal handler
In the case of a timeout, we dump a log of what's happening to a file
in /tmp.  We do it from the signal handler, which is an unreliable hack
(BZ58365).

Instead, create another (lower-priority) child to do the dump, then
kill the timedout script.

Note that this doesn't quite work as intended (the dump is often run
after the script has been killed), so the next patch resolves this.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 7ee5ecc8d53e78e2dec21197b74a74cc4ae1834c)
2010-04-08 15:13:29 +09:30
Ronnie Sahlberg
06885ea9a7 In the recovery daemon, keep track of which node we have assigned public ip
addresses and verify that the remote nodes have/keep a consistent view of
assigned addresses.

If a remote node has an inconsistent view of addresses visavi the recovery
master this will trigger a full ip reallocation.

(This used to be ctdb commit f3bf2ab61f8dbbc806ec23a68a87aaedd458e712)
2010-04-08 14:25:26 +10:00
Stefan Metzmacher
32d00d0a0d controls: add stups for GET_PUBLIC_IP_INFO, GET_IFACES and SET_IFACE_LINK_STATE
metze

(This used to be ctdb commit a2c9e4578e149eccb2c6183f64a6b657eb95c5e1)
2010-01-20 11:10:59 +01:00
Stefan Metzmacher
37880b0d0a server: use CTDB_PUBLIC_IP_FLAGS_ONLY_AVAILABLE during a takeover run
We know ask for the known and available interfaces.
This means a node gets a RELEASE_IP event for all interfaces
it "knows", but doesn't serve and a node only gets a TAKE_IP event
for "available" interfaces.

metze

(This used to be ctdb commit a695a38e49e7c3e15a9706392dc920eeab1f11ba)
2010-01-20 11:10:59 +01:00
Stefan Metzmacher
b9f6afe4b0 client: add CTDB_PUBLIC_IP_FLAGS_ONLY_AVAILABLE ctdb_ctrl_get_public_ips_flags()
metze

(This used to be ctdb commit 6bd780510058e5589f2f7c3722d37acbba4935ab)
2010-01-20 11:10:58 +01:00
Stefan Metzmacher
15616d3271 reserve upper bits in ctdb_control->flags for opcode specific flags
metze

(This used to be ctdb commit 91122c322fbec08138b92c528d9a946f6727b4fd)
2010-01-20 11:10:58 +01:00
Stefan Metzmacher
bea53c60b8 server: keep the interface information in a list of ctdb_iface structures
metze

(This used to be ctdb commit ff5291778f0752e176539397e9530dcf0e546bea)
2010-01-20 11:10:58 +01:00
Stefan Metzmacher
a1da4e05b5 server: allow multiple interfaces comma separated in public_addresses
metze

(This used to be ctdb commit 33a00ef7233051acdbc66410130ec5d876a8422f)
2010-01-20 11:10:58 +01:00
Stefan Metzmacher
bec35e6441 server: add a ctdb_set_single_public_ip() helper function
metze

(This used to be ctdb commit 400b4806c4a9686a2ee6398b5d7c3e0ca0793fd1)
2010-01-20 11:10:57 +01:00
Stefan Metzmacher
fd06167caa server: add "init" event
This is needed because the "startup" event runs after the initial recovery,
but we need to do some actions before the initial recovery.

metze

(This used to be ctdb commit e953808449c102258abb6cba6f4abf486dda3b82)
2010-01-20 09:44:36 +01:00
Stefan Metzmacher
9cba540514 lib/util: import fault/backtrace handling from samba.
metze

(This used to be ctdb commit 8171d66f0061fe23ed6dfef87ffe63bfc19596eb)
2010-01-20 09:44:36 +01:00
Ronnie Sahlberg
a1d60b1511 Make the size of the in memory ringbuffer for keeping the recent log messages
configureable using --log-ringbuf-size=<num-entries>.

Add an entry in the sysconfig file to set this persistently.

(This used to be ctdb commit c79c2da69bc352f509e7fca4b9172a4b7f23c0f8)
2010-01-15 15:38:56 +11:00
Ronnie Sahlberg
4c722fe34c fix a conflict in the merge from rusty
Merge commit 'rusty/ctdb-no-setsched'

Conflicts:

	server/ctdb_vacuum.c

(This used to be ctdb commit b4365045797f520a7914afdb69ebd1a8dacfa0d9)
2009-12-17 08:18:04 +11:00
Rusty Russell
af2613e16f ctdb: use mlockall, cautiously
We don't want ctdb stalling due to paging; this can be far worse than
scheduling delays.  But if we simply do mlockall(MCL_FUTURE), it
increases the risk that mmap (ie. tdb open) or malloc will fail,
causing us to abort.

This patch is a compromise: we mlock all current pages (including
10k of future stack for expansion) and then relock when a client
asks us to open a TDB.  We warn, but don't exit, if it fails.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 82f778e85440bc713d3f87c08ddc955d3cfce926)
2009-12-16 20:57:20 +10:30
Rusty Russell
c488ba440a Remove RT priority, use niceness.
1) It's buggy.  Code needs to be carefully written (ie. no busy
   loops) to handle running with it, and we fork and run scripts.[1]

2) It makes debugging harder.  If ctdbd loops (as has happened recently)
   it can be extremely hard to get in and see what's happening.  We've already
   seen the valgrind hacks.

3) We have seen recent scheduler problems.  Perhaps they are unrelated,
   but removing this very unusual setup is unlikely to hurt.

4) It doesn't make anything faster.  Under all but the most perverse of
   circumstances, 99% of the cpu gives the same performance as 100%, and
   we will always preempt normal processes anyway.

[1] I made this worse in 0fafdcb8d353 "eventscript: fork() a child for
    each script" by removing the switch_from_server_to_client() which
    restored it, but even that was only for monitor scripts.  Others were
    run with RT priority.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 482c302d46e2162d0cf552f8456bc49573ae729d)
2009-12-16 19:26:22 +10:30
Rusty Russell
f148735928 Add --valgringing flag instead of --nosetsched
The do_setsched was being tested for whether to mmap tdbs: let's make it
explicit.  We can also happily move the kill-child eventscript hack under
this flag.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> 


(This used to be ctdb commit 2ee86cc1f311d7b7504c7b14d142b9c4f6f4b469)
2009-12-16 20:59:15 +10:30
Stefan Metzmacher
f1f0af2b67 server: add CTDB_CONTROL_DB_SET_HEALTHY and CTDB_CONTROL_DB_GET_HEALTH
metze

(This used to be ctdb commit 7332d900538f0cbcd953a723417a0fe31dc9807c)
2009-12-16 08:08:29 +01:00
Stefan Metzmacher
94bc40307a server: Use tdb_check to verify persistent tdbs on startup
Depending on --max-persistent-check-errors we allow ctdb
to start with unhealthy persistent databases.

The default is 0 which means to reject a startup with
unhealthy dbs.

The health of the persistent databases is checked after each
recovery. Node monitoring and the "startup" is deferred
until all persistent databases are healthy.

Databases can become healthy automaticly by a completely
HEALTHY node joining the cluster. Or by an administrator
with "ctdb backupdb/restoredb" or "ctdb wipedb".

metze

(This used to be ctdb commit 15f133d5150ed1badb4fef7d644f10cd08a25cb5)
2009-12-16 08:06:10 +01:00
Stefan Metzmacher
b74918b465 server: open /var/ctdb/state/persistent_health.tdb.X on startup
This node internal tdb will store the HEALTH state of persistent
tdbs.

metze

(This used to be ctdb commit cbda4666be88c11a810a192a70667b57f773ace1)
2009-12-16 08:03:56 +01:00
Stefan Metzmacher
9a96ae0c97 server: only do the mkdir() calls for db_directory* once at the start
metze

(This used to be ctdb commit f30f33685db50860b6cd6fd1b6bdc3066620a78f)
2009-12-16 08:03:56 +01:00
Stefan Metzmacher
b48228e7f9 server: add db_directory_state to ctdb_context
metze

(This used to be ctdb commit 656a6ec5ed81ccfbb86144156a3158e48f105ee4)
2009-12-16 08:03:55 +01:00
Ronnie Sahlberg
640c48c844 Revert "cleanup: remove a tunable we no longer use in the eventscripts any more :"
This reverts commit 401f421fa003d9515df15e759b50b56e0c67d69c.

Conflicts:

	include/ctdb_private.h
	server/ctdb_tunables.c

(This used to be ctdb commit b883d19a495a41a22db37f9c2cf6250fee529de0)
2009-12-16 09:51:17 +11:00
Ronnie Sahlberg
0982299bed Revert "Make fetch_locked more scalable"
This reverts commit 5736e17c139c9a8049e235429aeae0c6c9d0e93d.

(This used to be ctdb commit 3d2d877d877146ca09a28a3a44f4840eb36fd377)
2009-12-15 14:26:28 +11:00
Ronnie Sahlberg
5a7e9900df Merge commit 'obnox/ctdb-wip-trans3' into trans3
(This used to be ctdb commit ac06a0e042e7d024060d6e87a49bda9ccc072c52)
2009-12-15 14:25:55 +11:00
Ronnie Sahlberg
649ba2631d Rename the tunable EventScriptBanCount to EventScriptTimeoutCount
since we no longer ban nodes when dodgy scripts continue to hang.

We now only mark nodes as unhealthy if monitor events fail or timeout. Never ban.

(This used to be ctdb commit 5c8e56fc7a518e115bceac257867739283cf6a1e)
2009-12-14 15:53:23 +11:00
Ronnie Sahlberg
ed6b5a8c68 cleanup: remove a tunable we no longer use in the eventscripts any more :
EventScriptUnhealthyOnTimeout

(This used to be ctdb commit 401f421fa003d9515df15e759b50b56e0c67d69c)
2009-12-14 15:48:47 +11:00
Ronnie Sahlberg
e76561f544 remove the variable "disable when unhealthy"
there is no rational need for a setting where we permanently mark nodes as disabled everytime an eventscript fails

(This used to be ctdb commit 68a8ee99b128a5ec883600735626bdb3bbc9c503)
2009-12-14 15:40:54 +11:00