samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2025-01-13 13:18:06 +03:00

Author	SHA1	Message	Date
Amitay Isaacs	870409ed1c	recoverd: Always do an early exit from main_loop if node is stopped or banned A stopped or banned node cannot do anything useful. So do not participate in any cluster activity and do not cause any unnecessary network traffic. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 2396981c4bcf30530aeb7f4395093cc202105b50)	2013-07-02 12:59:09 +10:00
Amitay Isaacs	7b761c4b97	recoverd: Do not set banning credits on a node if current node is inactive If the current node is banned or stopped, then it should not assign banning credits to other nodes since the current node will not have up-to-date flags of other nodes. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 38304f88e0c634e97d4687c25adef975f71537b8)	2013-07-02 12:59:09 +10:00
Amitay Isaacs	5deebd3b75	banning: Do not come out of ban if databases are not frozen Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit a60f228f8380f222f838eb619d2ab55f96f11ac2)	2013-07-02 12:59:09 +10:00
Amitay Isaacs	9a944d71dc	banning: No need to check if banned pnn is for local node If the banned pnn is not the local node, the function returns early. So no need for additional check. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 297d93cecc3c0655e72ecac38508e113bdbeab9c)	2013-07-02 12:59:08 +10:00
Amitay Isaacs	c6914e3891	banning: Make ctdb_local_node_got_banned() a void function When this function is called, we are already committed to banning and there is no point in failing this function. If freezing of databases fails, it will be fixed by the recovery daemon. (This used to be ctdb commit bb178338658b4ae32382a1f62f7c21cee1d4878f)	2013-07-02 12:59:08 +10:00
Amitay Isaacs	cf1d4bfde3	recoverd: Also check if current node is in recovery when it is banned Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 6a9dbb8fb0f1f6e8c206189cdc2d33bb371ea2a8)	2013-07-02 12:59:08 +10:00
Amitay Isaacs	3052006bf9	recoverd: Set node_flags information as soon as we get nodemap Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 8d622660a14c929e365d306147b378ea6ab92175)	2013-07-02 12:59:08 +10:00
Amitay Isaacs	36d8d25b6c	recoverd: Remove old comment as the code corresponding to that has gone away Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 34af2cdf686d5d77854cbaa7bbcd8f878e9171c7)	2013-07-02 12:59:08 +10:00
Amitay Isaacs	ea00a5ecf5	banning: Log ban state changes for other nodes at higher debug level Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit c6f8407648abb37f2ed781afa5171dad8c9f59e9)	2013-07-02 12:59:08 +10:00
Amitay Isaacs	622ccd09f9	freeze: Make ctdb_start_freeze() a void function If this function fails due to memory errors, there is no way to recover. The best course of action is to abort. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 46efe7a886f8c4c56f19536adc98a73c22db906a)	2013-07-02 12:59:08 +10:00
Amitay Isaacs	cf17247d31	freeze: If priority is invalid here, it's time to abort ctdb_start_freeze() is called from ctdb_control_freeze(), which fixes the priority if it's 0 and returns an error if it's invalid. Other callers of ctdb_start_freeze() are internal to CTDB. So if priority is invalid in ctdb_start_freeze(), definitely something is seriously wrong. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 87716e8f504d659515d3dbcf93badbf106873bc8)	2013-07-02 12:59:08 +10:00
Amitay Isaacs	6fe0089bc0	freeze: Log message from ctdb_start_freeze() and ctdb_control_freeze() This ensures that whenever databases are frozen either via sending control or by calling ctdb_start_freeze(), the action is logged. Since ctdb_control_freeze() calls ctdb_start_freeze(), move logging of message in early return condition if databases are already frozen. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 478e24bceda3fedfba54ccb48faa115df726b819)	2013-07-02 12:57:03 +10:00
Amitay Isaacs	d439aa05a8	recoverd: Print banning message only after verifying pnn Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 4be8dff3a4451192f838497b4747273685959bed)	2013-06-28 14:20:12 +10:00
Amitay Isaacs	6960bf78ff	recoverd: When updating flags on nodes, send updated flags and not old flags This was broken by commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa. Instead of a SRVID_SET_NODE_FLAGS message to recovery daemon, a control was sent to the local daemon which in turn informed the recovery daemon. And while doing this change old flags were sent via CONTROL_MODIFY_FLAGS. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 7eb2f89979360b6cc98ca9b17c48310277fa89fc)	2013-06-28 14:20:12 +10:00
Martin Schwenke	44e885e98e	ctdbd: Fix panic on overlapping shutdowns The runstate can't be set to SHUTDOWN twice, so the current naive code causes a panic on the 2nd shutdown. This regression was introduced in commit 8076773a9924dcf8aff16f7d96b2b9ac383ecc28. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f1b7ca8dc3f34a59c7b3e55748f974ac9ed8f458)	2013-06-22 15:51:16 +10:00
Martin Schwenke	6a52a87028	ctdbd: Refactor shutdown sequence Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b32fd04bfbf33062d45365b37a7247e272a76ceb)	2013-06-22 15:51:02 +10:00
Martin Schwenke	58d499d3ae	logging: Notify parent when logging daemon is up Messages are lost until it is really up because syslogd_is_started is set too early. Adding a pipe to do the notification allows the parent to wait and only set syslogd_is_started when the logging daemon is actually ready. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit f3dd2eec200d6eeada2ea19cd7e76f1edfad6167)	2013-06-20 13:01:10 +10:00
Martin Schwenke	26d0746b5d	ctdbd: "init" event should run earlier in daemon initialisation It should run before: * the transport is started; * databases are attached; and * processing configuration files (e.g. nodes, public_addresses). Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 0a0c8543f167e11b75a622513367b083e42cbd3f)	2013-06-20 13:01:09 +10:00
Mathieu Parent	d82b9ae410	build: Fix tdb.h path to enable building with system TDB library (This used to be ctdb commit f8bf99de3a5f56be67aaa67ed836458b1cf73e86)	2013-06-14 16:45:27 +10:00
Amitay Isaacs	d0c858f211	ctdbd: Make sure we don't kill init process by mistake If getpgrp() fails, it will return -1 and that will send KILL signal to init process (PID 1). This does not happen on RHEL, but does on AIX. Reported-by: Chris Cowan <cc@us.ibm.com> Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit edb2a3556d03e248b42f63dd2c62382b723bc98f)	2013-06-14 16:39:48 +10:00
Martin Schwenke	7513f0ba61	recoverd: Log node that causes takeover run to fail Extend takeover_fail_callback() to just log (and not do any ban processing) when the callback data is NULL. Always call ctdb_takeover_run() with the callback so that useful errors are always logged. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit c429394afbabaee09f9216dc743419adddf523ea)	2013-06-13 15:55:48 +10:00
Amitay Isaacs	140336383b	ctdbd: Log node state transitions at higher debug level Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit db31dc48bd3135e9242af08bb79b67a17a2b1668)	2013-05-29 17:47:15 +10:00
Martin Schwenke	1ab2bbb349	recoverd: Backward compatibility for nodes without IPREALLOCATED control Consider the case of upgrading a cluster node by node, where some nodes are still running older versions of CTDB without the IPREALLOCATED control. If a "new" node takes over as recovery master and a failover occurs, then it will attempt to send IPREALLOCATED controls to all nodes. The "old" nodes will fail in a fairly nondescript way (result == -1). To try to handle this situation, fall back to the EVENTSCRIPT control to handle "ipreallocated". Only do this on the failed nodes. However, do not do this on nodes that timed out (they've probably implemented the control and we should call the regular fail_callback to get those nodes banned) or for stopped nodes (since they can't actually run the "ipreallocated" event via the EVENTSCRIPT control). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b2654853ce9b7c18c5874b080bc94d3118078a5d)	2013-05-27 15:15:25 +10:00
Amitay Isaacs	a002c6ec12	vacuum: Reduce the priority of non-critical error Since the complete database is not locked when the receive_records control is received, it's possible that we may not be able to obtain lock on a chain. We will try again to store this record. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 32723c9efdad1c6ca4aa53f308ccd9bef1aadfff)	2013-05-24 14:22:16 +02:00
Michael Adam	d1dd29197e	ctdbd: fix comment explaining redirection of CTDB_REQ_CALL. Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit b697625b184227dad1be31a41b7a3fd9bd312e29)	2013-05-24 22:06:24 +10:00
Michael Adam	3f03a3c8a3	ctdbd: remove a nonempty blank line Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit d9e24782a90d9ce29c0e6584b75d2b186142174d)	2013-05-24 22:06:21 +10:00
Michael Adam	a0b20771fe	ctdbd: update comment describing ctdb_call_send_redirect() Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 9a21d417c51fb9cad8f2e87e00ca54d379aef860)	2013-05-24 22:06:16 +10:00
Martin Schwenke	f35e9bba9b	recoverd: Nodes can only takeover IPs if they are in runstate RUNNING Currently the order of the first IP allocation, including the first "ipreallocated" event, and the "startup" event is undefined. Both of these events can (re)start services. This stops IPs being hosted before the "startup" event has completed. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit f15dd562fd8c08cafd957ce9509102db7eb49668)	2013-05-24 16:27:55 +10:00
Martin Schwenke	7f03618ae4	recoverd: Handle errors carefully when fetching tunables If a tunable is not implemented on a remote node then this should not be fatal. In this case the takeover run can continue using benign defaults for the tunables. However, timeouts and any unexpected errors should be fatal. These should abort the takeover run because they can lead to unexpected IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c0c27762ea728ed86405b29c642ba9e43200f4ae)	2013-05-24 16:27:55 +10:00
Martin Schwenke	116f62a7b3	recoverd: Set explicit default value when getting tunable from nodes Both of the current defaults are implicitly 0. It is better to make the defaults obvious. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 1190bb0d9c14dc5889c2df56f6c8986db23d81a1)	2013-05-24 16:04:57 +10:00
Martin Schwenke	140f0cfd3b	ctdbd: Update the get_tunable code to return -EINVAL for unknown tunable Otherwise callers can't tell the difference between some other failure (e.g. memory allocation failure) and an unknown tunable. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 03fd90d41f9cd9b8c42dc6b8b8d46ae19101a544)	2013-05-24 16:04:50 +10:00
Martin Schwenke	e78b064dcc	recoverd: Whitespace improvements Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 473cfcb019f0cb4a094bf10397f7414f7923ee57)	2013-05-24 15:55:11 +10:00
Martin Schwenke	1a181a4284	recoverd: Use talloc_array_length() for simpler code Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f6792f478197774d2f3b2258c969b67c83e017ab)	2013-05-24 15:55:10 +10:00
Martin Schwenke	94b0e8dfeb	ctdbd: When the "setup" event fails log an error and exit, don't abort The "setup" event can fail when one of the eventscripts fails to run its "setup" event. If this occurs then the eventscript should log an error. The stack trace and core file generated when we abort provides no useful information. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c50eca6fbf49a6c7bf50905334704f8d2d3237d7)	2013-05-24 14:08:07 +10:00
Martin Schwenke	6d9667f01c	ctdbd: Add new runstate CTDB_RUNSTATE_FIRST_RECOVERY This adds more serialisation to the startup, ensuring that the "startup" event runs after everything to do with the first recovery (including the "recovered" event). Given that it now takes longer to get to the "startup" state, the initscript needs to wait until ctdbd gets to "first_recovery". Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit ed6814ff0a59ddbb1c1b3128b505380f60d7aeb7)	2013-05-24 14:08:07 +10:00
Martin Schwenke	77671b9ef5	ctdbd: New control CTDB_CONTROL_GET_RUNSTATE Also new client function ctdb_ctrl_get_runstate(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit dc4220e6f618cc688b3ca8e52bcb3eec6cb55bb1)	2013-05-24 14:08:07 +10:00
Martin Schwenke	147f6bb4b8	ctdbd: Start logging process earlier Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit f43fe3a560d5915c1a9893256f4e7bfe3d7e290a)	2013-05-24 14:08:07 +10:00
Martin Schwenke	0e678a73b8	ctdbd: Only start recovery daemon and timed events after setup event This deconstructs ctdb_start_transport(), which did much more than starting the transport. This removes a very unlikely race and adds some clarity. The setup event is supposed to set the tunables before the first recovery. However, there was nothing stopping the first recovery from starting before the setup event had completed. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit c31feb27dcdb748b5333321c85fe54852dfa1bcf)	2013-05-24 14:08:06 +10:00
Martin Schwenke	63577c96db	ctdbd: Replace ctdb->done_startup with ctdb->runstate This allows states, including startup and shutdown states, to be clearly tracked. This doesn't include regular runtime "states", which are handled by node flags. Introduce new functions ctdb_set_runstate(), runstate_to_string() and runstate_from_string(). Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 8076773a9924dcf8aff16f7d96b2b9ac383ecc28)	2013-05-24 14:08:06 +10:00
Amitay Isaacs	c8d577eb80	locking: Set lock helper path once Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 80fbe9364350d42658f7f8af250ac87eb1afbc21)	2013-05-24 09:06:40 +10:00
Amitay Isaacs	1ddc7b0d10	locking: Remove functions that are not used anymore These functions were used in locking child process to do the locking. With locking helper, these are not required. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit c660f33c3eaa1b4a2c4e951c1982979e57374ed4)	2013-05-24 09:06:40 +10:00
Amitay Isaacs	90c4fa77b9	locking: Remove functions that are not used anymore These functions were used in locking child process to do the locking. With locking helper, these are not required. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 6ea3212a7b177c6c06b1484cf9e8b2f4036653d9)	2013-05-24 09:06:40 +10:00
Amitay Isaacs	ae25420e56	locking: Use separate locking helper binary for locking Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 7cde53a6cbe74b1e46f7e1bca298df82c08de866)	2013-05-24 09:06:40 +10:00
Amitay Isaacs	e30978eae1	locking: Create commandline arguments for locking helper Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit f665e3d540c90579952e590caa5828acb581ae61)	2013-05-24 09:06:39 +10:00
Amitay Isaacs	30aa825c1e	locking: Add a standalone helper to lock record/db Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit a08b6ac19506160f3fb5925ea025027dce07781d)	2013-05-24 09:06:39 +10:00
Amitay Isaacs	c9f4589c13	locking: Use database iterator for unmarking databases Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 7630ca4116b476636c27407748088ea335f1a06c)	2013-05-24 09:06:39 +10:00
Amitay Isaacs	65a9195916	locking: Add handler function for unmarking a database Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit adc113055de98fae276f9b501aff5c03cd25ddc8)	2013-05-24 09:06:39 +10:00
Amitay Isaacs	a5133d16e7	locking: Use database iterator for marking databases Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit e8ea65b2713417db4a618a9f4633991cfaa93fe6)	2013-05-24 09:06:39 +10:00
Amitay Isaacs	ed359bb1ea	locking: Add handler function for marking a database Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit f120e40533780e02ff1cdc41cc6d3af1c4c83258)	2013-05-24 09:06:39 +10:00
Amitay Isaacs	c5c79d63f2	locking: Use database iterator for unlocking databases Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 187ed83f9701c7fa8d3cc476d47c5d2a87d5c308)	2013-05-24 09:06:39 +10:00
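
Note on d0c858f211 (getpgrp() failure): the message says a -1 return from getpgrp() ends up sending KILL to init (PID 1). That is consistent with the value being negated and passed to kill(), since kill() treats a pid of 1 as the init process. The sketch below only illustrates that arithmetic and one possible guard. The helper name pgrp_is_safe_to_signal() is hypothetical and is not taken from the CTDB source; the actual fix in the commit may look different.

```c
#include <stdbool.h>
#include <sys/types.h>

/*
 * Hypothetical helper (an assumption, not CTDB code): decide whether a
 * value returned by getpgrp() can safely be negated and passed to kill().
 *
 * POSIX kill(pid, sig) interprets pid as follows:
 *   pid >  0  : signal the single process with that PID
 *   pid == 0  : signal the caller's own process group
 *   pid == -1 : signal every process the caller is allowed to signal
 *   pid < -1  : signal the process group whose ID is -pid
 *
 * So for kill(-pgrp, sig):
 *   pgrp == -1 : kill(1, sig), which targets init
 *   pgrp ==  0 : kill(0, sig), which targets the caller's own group
 *   pgrp ==  1 : kill(-1, sig), which targets every permitted process
 * Only pgrp > 1 addresses one specific process group.
 */
bool pgrp_is_safe_to_signal(pid_t pgrp)
{
    return pgrp > 1;
}
```

A caller would check the result before issuing kill(-pgrp, SIGKILL) and log an error otherwise. Rejecting 0 and 1 as well as -1 is a conservative choice made for this sketch, not something the commit message states.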

1242 Commits