samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2025-01-12 09:18:10 +03:00

Author	SHA1	Message	Date
Martin Schwenke	24b734f084	ctdb-recoverd: LCP2 cleanups * Remove unnecessary candimbl parameter. This parameter can be cheaply calculated in lcp2_failback_candidate(). The compiler will probably do an excellent job optimising it. :-) * Clarify a debug statement This is much clearer than doing a complex recalculation of a known value. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2014-02-19 12:04:47 +11:00
Martin Schwenke	9e5ef44f32	ctdb-recoverd: Optimise check for rebalance candidates in LCP2 Currently this can be checked many times. However, there's no point calling the rebalance/failback code at all if there are no rebalance candidates. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2014-02-19 12:04:47 +11:00
Michael Adam	0535f73c3a	ctdb:vacuum: move retrieval of freelist to after vacuum run The fast vacuum run may have increased the freelist size. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri Feb 14 03:15:30 CET 2014 on sn-devel-104	2014-02-14 03:15:30 +01:00
Michael Adam	bd474985b1	ctdb:vacuum: fix debug message typo in add_record_to_delete_list() Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2014-02-14 11:14:31 +11:00
Martin Schwenke	f1a20d748f	ctdb-recoverd: Fix a bug in the LCP2 rebalancing code srcimbl gets changed on every iteration of the loop. The value that should be stored for the new imbalance of the source node is minsrcimbl. To help diagnose this, added some extra debug that can be left in. The extra debug changes the output of a couple of tests. Note that the resulting IP allocations in those tests are unchanged - only the debug output is changed. Also add some new tests that illustrate the bug. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2014-02-13 02:03:24 +01:00
Amitay Isaacs	276b233c00	ctdb-daemon: Consult CTDB_DEBUG_HUNG_SCRIPT variable before running debug script If CTDB_DEBUG_HUNG_SCRIPT is set, use that instead of the default debug script. This code was dropped by mistake in commit `18c1f43210`. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Wed Feb 12 08:47:47 CET 2014 on sn-devel-104	2014-02-12 08:47:47 +01:00
Amitay Isaacs	1566790e5a	ctdb-daemon: Return negative status only if there are known errors If event script does not exist or does not have execute permissions, then return negative errno to distinguish from the exit errors of event script. Signed-off-by: Amitay Isaacs <amitay@gmail.com>	2014-01-31 15:57:49 +11:00
Martin Schwenke	e5778cc172	ctdb/daemon: reloadips must register state of asynchronous controls Otherwise ctdb_client_async_wait() is a no-op. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2014-01-31 13:36:04 +11:00
Amitay Isaacs	eee450fec2	ctdb-daemon: Simplify listing event scripts using scandir Instead of using RB tree for sorting the script names (incorrectly since it's only using the leading numbers in the script name), use scandir with alphasort. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Tue Jan 21 06:41:25 CET 2014 on sn-devel-104	2014-01-21 06:41:25 +01:00
Amitay Isaacs	cbffbb7c2f	ctdb-daemon: Do not run monitor event if any other event is already running Any currently running monitor events are cancelled if any other events are scheduled. However, this does not stop monitor events from being run when other events are already running. Keep track of the number of active events and schedule monitor event only if there are no active events. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-01-21 11:30:41 +11:00
Martin Schwenke	e6304d1e1a	ctdb/daemon: Untangle serialisation of 1st recovery -> startup -> monitor At the moment ctdb_check_healthy() is overloaded to wait until the first recovery is complete, handle the "startup" event and also actually handle monitoring. This is untidy and hard to follow. Instead, have the daemon explicitly wait for 1st recovery after the "setup" event. When first recovery is complete, schedule a function to handle the "startup" event. When the "startup" event succeeds then explicitly enable monitoring. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2014-01-17 17:59:41 +11:00
Martin Schwenke	e77d5f99e3	ctdb/recoverd: Do not refuse disabling takeover runs on inactive nodes Failure might be expected when disabling takeover runs on banned nodes, since they might be suffering from performance problems or similar. More broadly, administrators who reconfigure a cluster that isn't in a happy state aren't necessarily doing something sensible. However, allowing takeover runs to be disabled on inactive nodes stops reconfiguration of stopped nodes. This is probably an unreasonable limitation, so drop it. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2014-01-17 17:59:19 +11:00
Martin Schwenke	a955d0bedc	ctdb-recoverd: Ignore failed ipreallocated controls to inactive nodes Currently timeouts for controls to inactive nodes can cause banning credits to be applied. This should not happen. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2014-01-17 17:59:08 +11:00
Amitay Isaacs	a92fd11ad1	ctdb-daemon: Remove ctdb_fork_with_logging() This function has been replaced with ctdb_vfork_with_logging(). Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Thu Jan 16 04:05:35 CET 2014 on sn-devel-104	2014-01-16 04:05:35 +01:00
Amitay Isaacs	97575e1ba0	ctdb-daemon: Remove unused code to run eventscripts Eventscripts are now executed using a helper. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-01-16 12:11:38 +11:00
Amitay Isaacs	18c1f43210	ctdb-daemon: Replace ctdb_fork_with_logging with ctdb_vfork_with_logging (part 2) Use ctdb_event_helper to run debug-hung-script.sh. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-01-16 12:11:38 +11:00
Amitay Isaacs	d86662a925	ctdb-daemon: Replace ctdb_fork_with_logging with ctdb_vfork_with_logging (part 1) Use ctdb_event_helper to run eventscripts. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-01-16 12:11:37 +11:00
Amitay Isaacs	69324b61f0	ctdb-daemon: Add helper process to execute event scripts Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-01-16 12:11:37 +11:00
Amitay Isaacs	2879404388	ctdb-daemon: Add ctdb_vfork_with_logging() This will be used to spawn lightweight helper processes to run eventscripts. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-01-16 11:41:12 +11:00
Amitay Isaacs	7aa20ccb5c	ctdb-daemon: No need to call event scripts with CTDB_CALLED_BY_USER This was added to support external monitoring using CTDB event scripts. However, it was never used. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-01-16 11:41:12 +11:00
Amitay Isaacs	bafa467021	ctdb-daemon: Deprecate RELOAD and STATUS events These events have never been used. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-01-16 11:41:12 +11:00
Martin Schwenke	44a0466ac1	ctdb-recoverd: Only respond to currently queued ipreallocated requests Otherwise new requests can come in during the latter parts of the takeover run when the IP allocation algorithm has already run, and the new requests will be dequeued even though they haven't really been processed. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org>	2013-11-27 18:46:16 +01:00
Martin Schwenke	efc77ba6ac	ctdb-recoverd: For persistent databases a sequence number of 0 is valid Otherwise recovery ends up done by RSN when it is unnecessary. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org>	2013-11-27 18:46:16 +01:00
Amitay Isaacs	4ea721b2c1	ctdb-locking: Use vfork instead of fork to exec helpers There is a significant overhead using fork() over vfork(), especially when the child process execs a helper. The overhead is in memory space and time. # strace -c ./test_fork 1024 200 count=1024, size=204800, total=200M failed fork=0 time for fork() = 4879.597000 us % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 100.00 4.543321 3304 1375 375 clone 0.00 0.000071 0 1033 mmap 0.00 0.000000 0 1 read 0.00 0.000000 0 3 write 0.00 0.000000 0 2 open 0.00 0.000000 0 2 close 0.00 0.000000 0 3 fstat 0.00 0.000000 0 3 mprotect 0.00 0.000000 0 1 munmap 0.00 0.000000 0 3 brk 0.00 0.000000 0 1 1 access 0.00 0.000000 0 1 execve 0.00 0.000000 0 1 arch_prctl ------ ----------- ----------- --------- --------- ---------------- 100.00 4.543392 2429 376 total # strace -c ./test_vfork 1024 200 count=1024, size=204800, total=200M failed fork=0 time for fork() = 82.041000 us % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 96.47 0.001204 1 1000 vfork 3.53 0.000044 0 1033 mmap 0.00 0.000000 0 1 read 0.00 0.000000 0 3 write 0.00 0.000000 0 2 open 0.00 0.000000 0 2 close 0.00 0.000000 0 3 fstat 0.00 0.000000 0 3 mprotect 0.00 0.000000 0 1 munmap 0.00 0.000000 0 3 brk 0.00 0.000000 0 1 1 access 0.00 0.000000 0 1 execve 0.00 0.000000 0 1 arch_prctl ------ ----------- ----------- --------- --------- ---------------- 100.00 0.001248 2054 1 total Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org>	2013-11-27 18:46:16 +01:00
Amitay Isaacs	0eeb73c187	ctdb-locking: Update current lock statistics when lock is scheduled When a child process is created for a lock request, the current locks statistics should be updated immediately. This will provide accurate information on number of active lock requests. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org>	2013-11-27 18:46:16 +01:00
Amitay Isaacs	3879e9991f	ctdb-locking: Do not merge multiple lock requests to avoid unfair scheduling Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org>	2013-11-27 18:46:16 +01:00
Amitay Isaacs	094f34e9bf	ctdb-locking: Implement active lock requests limit per database This limit was currently a global limit and not per database. This prevents any database freeze lock requests from getting scheduled if the global limit was reached. Only individual record requests should be limited and database freeze requests should always get scheduled. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org>	2013-11-27 18:46:16 +01:00
Martin Schwenke	028fe930b6	ctdb-recoverd: Fix backward compatibility for CTDB_SRVID_TAKEOVER_RUN When running a mixed version cluster, compatibility with older versions was broken during recent refactorisation. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org>	2013-11-27 18:46:16 +01:00
Martin Schwenke	6fbf399191	ctdb-recoverd: A node refuses to play against itself Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org>	2013-11-27 18:46:16 +01:00
Martin Schwenke	2038d166ad	ctdb-recoverd: Remove duplicate code to update flags during recovery This also happens earlier in do_recovery() and the nodemap is not updated after that, so this update is redundant. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Michael Adam <obnox@samba.org>	2013-11-27 18:46:16 +01:00
Amitay Isaacs	6d1b74f052	ctdb-server: Coverity fixes Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org>	2013-11-19 17:13:03 +01:00
Amitay Isaacs	41d37058ca	tunables: Remove obsolete tunables Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit ca5fc3431573c44d55d09d987c715fb53756fc1f)	2013-10-30 15:37:11 +11:00
Martin Schwenke	62076d3089	recoverd: Rebalancing should be done regardless of tunable Rebalance target nodes should be set even if a deferred rebalance is not configured. The user can explicitly cause a takeover run. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit afd9b51644af074752d74c412cb4e7ec2eba2c69)	2013-10-30 12:19:49 +11:00
Martin Schwenke	6b42805717	recoverd: Improve an error message in the election code Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 275ed9ebe287e39d891888c13810c70f347af8ac)	2013-10-30 11:34:56 +11:00
Martin Schwenke	5f80f4255c	Revert "if a new node enters the cluster, that node will already be frozen at start" This is unnecessary due to 03e2e436db5cfd29a56d13f5d2101e42389bfc94. Furthermore, if a node doesn't force an election but wins it then it can fail to record that it is the new recovery master. This can lead to a reverse split brain where there is no recovery master. This reverts commit c5035657606283d2e35bea40992505e84ca8e7be. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Conflicts: server/ctdb_recoverd.c (This used to be ctdb commit c8b542e059a54b8d524bd430cad9d82e5edd864d)	2013-10-30 11:34:56 +11:00
Martin Schwenke	45b44a7155	ctdbd: When a node is connected, log at DEBUG_NOTICE not DEBUG_INFO This is important enough that we should see it when the log level is DEBUG_NOTICE. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit eb8ec5681bfccb26c8ffae72952d54bb0ba46249)	2013-10-29 17:14:56 +11:00
Martin Schwenke	f88cf2d013	Revert "recoverd: Disable takeover runs on other nodes for 5 minutes" 5 minutes is too long to leave the cluster in limbo if the recovery daemon dies during a takeover run, even though this is quite unlikely. We need a new recover master to be able to do takeover runs fairly quickly. This reverts commit 71080676bb4acbd0d9b595a30cf7fe6dddbf426f. (This used to be ctdb commit 3e41170c78fc7a2bf526129c9b7db3739b61c6bf)	2013-10-29 17:14:55 +11:00
Amitay Isaacs	fc7f335843	daemon: Change the default recovery method for persistent databases Use sequence numbers to do recovery for persistent databases instead of RSNs. This fixes the problem of registry corruption during recovery. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 56486d1c01cc8ad0e4b8cee7a22429e72e50f03d)	2013-10-28 18:51:22 +11:00
Amitay Isaacs	4432aef6d1	packaging: Move ctdb/ directory from /var to /var/lib Introduce CTDB_VARDIR variable that points to /var/lib/ctdb by default. This makes CTDB_VARDIR consistent across C code and scripts. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 2c09aac71188f43cd592572b10ea30b7a2969678)	2013-10-25 12:06:07 +11:00
Martin Schwenke	b595712f25	ctdbd: Simplify database directory setting logic No need to check if the options are set. The options are always set via static defaults. No need to talloc_strdup() the values via wrapper functions. The options aren't going away. Remove now unused ctdb_set_tdb_dir() and similar functions. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 1fe82f3d7b610547ff4945887f15dd6c5798a49b)	2013-10-25 12:06:06 +11:00
Martin Schwenke	a604c3d945	ctdbd: Remove duplicate database directory setting logic Defaults for ctdb->db_directory and similar variables are currently set in 2 places. Change this to set them in only 1 place and make the directories at initialisation time instead of waiting until later. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit d73d84346488a2ed54e6a86f9d7ec641c8e33ace)	2013-10-25 12:06:06 +11:00
Martin Schwenke	e782b61732	ctdbd: Pass the public address file location in ctdb context No need to pass it as an extra argument to ctdb_start_daemon. Also ensure options.public_address_list gets a nice static default. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a3d63a9db89d08bb284b3b3a6db773422f21b477)	2013-10-22 15:37:54 +11:00
Martin Schwenke	463a091a77	ctdbd: Debug locks by default with override from environment variable Default is debug_locks.sh, relative to CTDB_BASE. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c11803e3dcc905a45a08d743595e63f9ca445f0d)	2013-10-22 15:37:54 +11:00
Martin Schwenke	4adc8f4f09	ctdbd: Default for event_script_dir should use CTDB_BASE Also get rid of ctdb_set_event_script_dir(). It creates an unnecessary copy of something that will be around for the lifetime of the process. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 21b4d1aba00902f1eee0cbf4f082b0794fd5b738)	2013-10-22 15:37:54 +11:00
Martin Schwenke	f9ce563135	ctdbd: Add nodes_file member to struct ctdb_context This allows ctdb_load_nodes_file() to move to ctdb_server.c and ctdb_set_nlist() to become static. Setting ctdb->nodes_file needs to be done early, before the nodes file is loaded. It is now set from CTDB_BASE instead of ETCDIR, so setting CTDB_BASE also needs to be done earlier. Unhack ctdbd_test.c - it no longer needs to define ctdb_load_nodes_file(). Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 20e705e63bd3b20837cc3ac92fdcf2a9650ccfc8)	2013-10-22 15:37:54 +11:00
Martin Schwenke	7c90395136	ctdbd: Don't check CTDB_BASE before setting it, just don't override That's what the 3rd argument to setenv(3) is for... :-) Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 30ca419aa1c78008f81839497921bbfba480e7fc)	2013-10-22 15:37:54 +11:00
Martin Schwenke	82e5effc40	ctdbd: Fix some errors in the popt configuration That 4th argument isn't a default or similar, so consistently make it 0. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 1c0a627df1b510f49c65ffeb4474240c8856cdf2)	2013-10-22 14:34:05 +11:00
Martin Schwenke	fbd2617cb8	recoverd: Remove function reload_nodes_file() It is a 1 line wrapper around ctdb_load_nodes_file(), so use that instead. We need less code... :-) Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 4a5d5935f4410a93a3343d85a24dbcddae2c4c20)	2013-10-22 14:34:03 +11:00
Martin Schwenke	a93361fca2	Revert "null out the pointer before we reload the nodes file" This reverts commit 4b0f32047e8bece0a052bdbe2209afe91b7e8ce3. This is not necessary. It just causes a memory leak. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 25fd05505f61dc595c0ef25bb6e332274d5530e8)	2013-10-22 14:34:03 +11:00
Amitay Isaacs	e63232e974	recoverd: Ignore failed flag updates on inactive nodes Signed-off-by: Amitay Isaacs <amitay@gmail.com> Pair-programmed-with: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 484c46eaae056480baf050fd91868f2fd0537985)	2013-10-22 14:34:03 +11:00

1395 Commits