samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2025-01-26 10:04:02 +03:00

Author	SHA1	Message	Date
Michael Adam	8732e2356f	recovery: data corruption of persistent DBs after recoveries: don't delete empty records The record-by-record mode of recovery deletes empty records. For persistent databases, this can lead to data corruption by deleting records that should be there: - Assume the cluster has been running for a while. - A record R in a persistent database has been created and deleted a couple of times, the last operation being deletion, leaving an empty record with a high RSN, say 10. - Now a node N is turned off. - This leaves the local database copy of D on N with the empty copy of R and RSN 10. On all other nodes, the recovery has deleted the copy of record R. - Now the record is created again while node N is turned off. This creates R with RSN = 1 on all nodes except for N. - Now node N is turned on again. The following recovery will choose the older empty copy of R due to RSN 10 > RSN 1. ==> Hence the record is gone after the recovery. On databases like Samba's registry, this can damage the higher-level data structures built from the various tdb-level records. This patch fixes that problem by not deleting empty records in recoveries for persistent databases. Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 6860c79aea416f56cfd7a6af790bbdf495dbc54e)	2012-11-20 00:48:24 +01:00
Michael Adam	9c65a7ef81	recoverd: fix a comment typo Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 909269a4a3690e1245117ca1af935401455785e6)	2012-11-20 00:48:23 +01:00
Michael Adam	79468f338a	vacuum: fix a comment typo Pair-Programmed-With: Volker Lendecke <vl@samba.org> Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit bab744e3c49efef2e05dc09e8ea9bd3e3fa58716)	2012-11-19 14:53:14 +01:00
Martin Schwenke	0f1bcebc80	ctdbd: Make the link status of new interfaces more flexible Neither up nor down is a good default value for the link status of a new interface. Up means that IPs can be assigned to interfaces before the true state is known and they can move away quickly if the interface is actually down. Down means that IPs can't be assigned to an interface for a variable amount of time - until a monitor cycle occurs - and this can result in imbalanced IPs. This is a neat compromise. Before the startup event completes, IPs can't be assigned to interfaces because all interfaces begin in a down state. As soon as the startup event completes, IPs can be allocated to any interface that has been marked up by the eventscript. Later, during normal operation, newly added IPs can be assigned to new interfaces immediately. The IPs will still move away if an interface is noticed to be down in the next monitor cycle, but that is the exception rather than the rule. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9275a69a414482f1053ae14528d5972575b9214e)	2012-11-19 15:53:13 +11:00
Amitay Isaacs	442d9905fe	locking: Do not use RECLOCK for tracking DB locks and latencies RECLOCK is for recovery lock in CTDB. Do not override the meaning for tracking locks on databases. Database lock latency has nothing to do with recovery lock latency. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 54e24a151d2163954e5a2a1c0f41a2b5c19ae44b)	2012-11-14 15:51:59 +11:00
Amitay Isaacs	85c8deca3f	recoverd: Track the nodes that fail takeover run and set culprit count If any of the nodes fail takeover run (either due to timeout or failure to complete within takeover_timeout interval) from main loop, recovery master will give up trying takeover run with the following message: "Unable to setup public takeover addresses. Try again later" And as a side-effect the monitoring is disabled on all the nodes. Before ctdb_takeover_run() is called from main loop, monitoring gets disabled via startrecovery event. Since ctdb_takeover_run() fails, it never runs recovered event and monitoring does not get re-enabled. In main_loop, ctdb_takeover_run() is called with a takeover_fail_callback. This callback will get called if any of the nodes fail in handling takeip/releaseip/ipreallocated events in ctdb_takeover_run(). Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit a5c6bb1fffb8dc3960af113957a1fd080cc7c245)	2012-11-14 10:59:54 +11:00
Martin Schwenke	861d5304ac	ctdbd: Fix compilation warning in locking code Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit cd64035d71ddff6aebe6c15a49e09527283425d2)	2012-10-31 12:33:25 +11:00
Volker Lendecke	2d1d5d312e	Add a \n to an error message (This used to be ctdb commit 9be3b23adbfc844b71bf1d4ddf0fbc3b269f15fa)	2012-10-25 17:11:15 +11:00
Martin Schwenke	db5dfe891c	recoverd: Add CTDB_SRVID_GETLOG and CTDB_SRVID_CLEARLOG These support getting and clearing logs from the ring-buffer in the recovery daemon. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit cbca233d1e03b2410e0bb63b936328d4a8b3c7b4)	2012-10-22 11:15:36 +11:00
Amitay Isaacs	d39fbd60b9	locking: Do not use ctdb_kill() to kill smbd processes ctdb_kill() is used to terminate processes spawned by CTDB. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 7d025281ee70c91ebcd4d9a908de1045a689786b)	2012-10-20 02:48:45 +11:00
Amitay Isaacs	1d83df7516	locking: Add database priority handling for older versions of samba In samba versions 3.6.x and older, database priorities are not set. later_db() function implements higher database priority (locking order) for these databases - brlock, g_lock, notify_onelevel, serverid, xattr_tdb Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit edbc8a6669b594d3c413d603e1c9fada9244c2ee)	2012-10-20 02:48:45 +11:00
Amitay Isaacs	3c34207481	locking: Schedule a new lock request every time a lock is released Since the number of active lock requests is limited to MAX_LOCK_PROCESSES_PER_DB (= 100), any new requests won't get scheduled when they are created. So schedule a pending request once the current active request is done. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit c8eb4a3170ab8524e638047053831ba547e9cce8)	2012-10-20 02:48:44 +11:00
Amitay Isaacs	a00e50e503	ctdbd: Replace lockwait with locking API and remove ctdb_lockwait.c Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 2126795153dacb255e441abcb36ee05107b6282a)	2012-10-20 02:48:44 +11:00
Amitay Isaacs	08ffbc342c	ctdb_recover: Replace static locking functions with locking API Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 4456a01d8f54ca6c771d7488048de5f638477d21)	2012-10-20 02:48:44 +11:00
Amitay Isaacs	23f83d58a5	ctdb_freeze: Replace locking functions with locking API Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 01ee86d2aafbcda658ef6acc2bba6d6781ae4047)	2012-10-20 02:48:44 +11:00
Amitay Isaacs	83306337df	ctdbd: locking: Provide non-blocking API for locking of TDB record/db/alldb This introduces a consistent API for handling locks on single record, complete db or all dbs. The locks are taken out in a child process. In cases of timeout, find the processes that currently hold the lock and log. Callback functions for locking requests take locked boolean to indicate whether the lock was successfully obtained or not. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 1af99cf0de9919dd89af1feab6d1bd18b95d82ff)	2012-10-20 02:48:44 +11:00
Martin Schwenke	199b971f57	ctdbd: Remove references to forcing running of eventscripts from log messages Running of eventscripts can be initiated from many places, including the recovery daemon. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 440892d75ef73c0aca22f47c0c01712be00cf5b7)	2012-10-18 20:05:43 +11:00
Martin Schwenke	bfbcdea610	recoverd: Clarify some misleading log messages Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 14589bf7c16ba017fe00d4e8bea8cc501546c60f)	2012-10-18 20:05:43 +11:00
Martin Schwenke	a884c8c453	recoverd: Verifying local IPs should only check for unhosted available IPs Currently it checks for unhosted IPs among the known IPs rather than available IPs. This means that a takeover run can be flagged even when that takeover run will be unable to assign a known, unhosted IP. Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3cc878bc97fdac764a60ed805f64d649eaab06e8)	2012-10-18 20:05:42 +11:00
Martin Schwenke	ebd9c7a277	Logging: Map TEVENT_DEBUG_FATAL to DEBUG_CRIT This is currently mapped to DEBUG_EMERG. CTDB really has no business logging anything at EMERG level since the whole system is not about to abort or catch fire. EMERG causes the message to appear on the console and on every terminal. That's a bit overzealous! There would be very few situations where logs are being filtered at level below ERROR, so CRIT should certainly suffice. The trigger for this was curious messages saying "No event for <n> seconds!" logged in a user's terminal. Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0e56e2dad1861892aa8ba59494ad244f2498314e)	2012-10-18 20:05:42 +11:00
Martin Schwenke	4719df62d6	recoverd: Track failure of "recovered" event, banning culprits Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9550c497e6d6ef5ee44826c4bd9ed5ad65174263)	2012-10-11 12:10:45 +11:00
Martin Schwenke	62046a8a4c	recoverd: When starting a takeover run disable IP verification Disable for TakeoverTimeout seconds. Otherwise the recovery daemon can get overzealous and start trying to add/delete addresses that it thinks are missing but where the eventscript just hasn't finished. This didn't use to matter so much but it is more important now that concurrent takeip/releaseip/updateip generate errors - we want to avoid spamming the log. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 56fcee3c7730cb12fa666072d5400949af6e5f7c)	2012-10-11 12:10:45 +11:00
Martin Schwenke	4b4e4d8870	ctdbd: Stop takeovers and releases from colliding in mid-air There's a race here where release and takeover events for an IP can run at the same time. For example, a "ctdb deleteip" and a takeover initiated by the recovery daemon. The timeline is as follows: 1. The release code registers a callback to update the VNN. The callback is executed after the eventscripts run the releaseip event. 2. The release code calls the eventscripts for the releaseip event, removing IP from its interface. The takeover code "updates" the VNN saying that IP is on some iface.... even if/though the address is already there. 3. The release callback runs, removing the iface associated with IP in the VNN. The takeover code calls the eventscripts for the takeip event, adding IP to an interface. As a result, CTDB doesn't think it should be hosting IP but IP is on an interface. The recovery daemon fixes this later... but it shouldn't happen. This patch can cause some additional noise in the logs: Release of IP 10.0.2.133/24 on interface eth2 node:2 recoverd:We are still serving a public address '10.0.2.133' that we should not be serving. Removing it. Release of IP 10.0.2.133/24 rejected update for this IP already in flight recoverd:client/ctdb_client.c:2455 ctdb_control for release_ip failed recoverd:Failed to release local ip address In this case the node has started releasing an IP when the recovery daemon notices the address is still hosted and initiates another release. This noise is harmless but annoying. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit bfe16cf69bf2eee93c0d831f76d88bba0c2b96c2)	2012-10-11 12:10:45 +11:00
Martin Schwenke	79ea15bf96	ctdbd: New tunable NoIPTakeoverOnDisabled Stops the behaviour where unhealthy nodes can host IPs when there are no healthy nodes. Set this to 1 when an immediate complete outage is preferred when all nodes are unhealthy. The alternative (i.e. default) can lead to undefined behaviour when the shared filesystem is unavailable. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a555940fb5c914b7581667a05153256ad7d17774)	2012-10-11 12:10:45 +11:00
Martin Schwenke	9aa9abcc19	ctdbd: Avoid unnecessary updateip event The existing code makes one fatally bad assumption: vnn->iface->references can never be -1 (or max-uint32_t in this case). Right now the reference counting is broken so a reference count of -1 is possible and causes a spurious updateip when vnn->iface is the same as best_iface. This can occur frequently because we get a lot of redundant takeovers, especially when each IP can only be hosted on one interface. This makes the code much more defensive by noting that when best_iface is the same as vnn->iface there is never a need for an updateip event. This effectively neuters the updateip code path when IPs can only be hosted by a single interface. This should obsolete 6a74515f0a1e24d97cee3ba05d89133aac7ad2b7. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 7054e4ded59c6b8f254dcfefaef64da05f25aecd)	2012-10-10 14:54:53 +11:00
Amitay Isaacs	3c1f656764	Revert "when creating/adding a public ip, set the initial interface to be the first interface specified" This reverts commit 4308935ba48ac7a29e7523315acf580019715f0f. This fixes 16_ctdb_config_add_ip.sh test when run against local daemons. When running against local daemons, if the interface is assigned as soon as an IP is added, then takeover would never assign this IP address. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 06dfd13604d08910e07cbf927c338d7b9fce9a2f)	2012-10-07 15:25:34 +11:00
Martin Schwenke	735c9107e1	recoverd: All inactive nodes should yield recovery master role Not just stopped nodes. In reality, this means that banned nodes will also yield, since nodes in the other inactive states won't be running a daemon. This seems sensible since if another node notices that an inactive node is the recovery master then it will force an election anyway. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit fc18188b7b63eb0dafbc47e3abf80e306e1dfc31)	2012-08-08 16:15:03 +10:00
Martin Schwenke	97248de3a9	recoverd: An inactive node should not force recovery master elections An inactive node can't become the recovery master. So if an inactive node notices that the recovery master is inactive, it shouldn't force an election for recovery master and nominate itself as a candidate. This can cause the recovery master to flip-flop between nodes when all nodes are inactive. If there is actually an active node then it will trigger the election. This is fairly cosmetic but is a step along the way towards ironing out weirdness when all nodes are stopped. Also, fix a related comment. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e7dc10da3ced54ea9d719ad167ee42dcca8dce75)	2012-08-08 16:14:52 +10:00
Martin Schwenke	20b75046fa	recoverd: main_loop() should not verify local IPs if node is stopped Doing these checks is pointless and potentially causes unnecessary log messages. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a0c30c820fd47d4f8620dc060c825be10754f5d1)	2012-08-08 16:11:11 +10:00
Martin Schwenke	ae0cdd137f	recoverd: verify_local_ip_allocation() should dup ifaces before early return If CTDB starts in STOPPED state then it thinks it is in the middle of a recovery. rec->ifaces is also NULL and an early exit further down (that checks to see if a recovery is in process) means that it stays that way. However, each time this function is entered the need for a takeover run is re-flagged. The takeover run never happens due to the early exit, causing a couple of unneeded messages to be logged each time. This is avoided by moving the code that sets rec->ifaces so that it is executed earlier and, in this case, in the middle of a recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f586e8a2911fc6e7f6698f516653145d8fd45dad)	2012-08-08 16:11:11 +10:00
Martin Schwenke	7df1da1c91	recoverd: Update a log message that has bit-rotted This message used to be correct because the ipreallocated event only handled updating the NAT gateway. However, that has changed so the message needs to be updated. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit cc9d96f4248e45ea99c5f00db1526426ac26fbc2)	2012-08-08 16:11:11 +10:00
Martin Schwenke	d038b9e8ba	recoverd: Fix bogus info in message about changed flags Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9119a568c2b4601318f7751f537dca2f92a7230b)	2012-08-08 16:11:11 +10:00
Martin Schwenke	65725d30d4	ctdbd: Remove the word "Forced" from message about running eventscripts The eventscripts are run after a takeover run and in this case they're not forced. The message seems to imply that someone has run "ctdb eventscript" when that is not necessarily the case. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3880589db4d563e438126cf5080261fa06b9e242)	2012-07-26 22:10:54 +10:00
Martin Schwenke	75a0041567	ctdbd: Fix ctdb_control_release_ip() on local daemons When running on local daemons no IPs are actually assigned to interfaces. Commit 9a806dec8687e2ec08a308853b61af6aed5e5d1e broke ctdb_control_release_ip() for local daemons because it asks the system which interface the given IP is on, instead of the old behaviour of trusting CTDB's internal records. For local daemons (i.e. !ctdb->do_checkpublicip) revert to the old behaviour of looking up the interface internally. This is good enough, given that the tests don't tend to misconfigure the addresses. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 38e8651b955afdbaf0ae87c24c55c052f8209290)	2012-07-26 22:10:54 +10:00
Amitay Isaacs	23a460602f	Remove tevent_loop_allow_nesting() Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 538c68d0e83e14f0000981ee06408b8f0035be37)	2012-07-16 12:12:05 +10:00
Amitay Isaacs	c4236ec8fb	ctdbd: Return explicit boolean values for function returning bool Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 3de2830ae68241ee95bcc14dc1bb896ff18d86ce)	2012-07-16 12:12:05 +10:00
Amitay Isaacs	e379fc3ea5	Fix compiler warnings. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit d29e1880c8ce7219e065d31b47b0e8ad9e83146d)	2012-07-13 14:50:56 +10:00
Gregor Beck	3fd0b8a5a5	ctdbd: refuse attaching with "persistent" to a non-persistent db and v.v. Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 1ebbaa620b3cfb9ff373828e4aaa84246cf3ec25)	2012-07-03 11:30:04 +02:00
Ronnie Sahlberg	694c1b269e	When we find an ip we shouldn't host, just release it Don't call a full blown clusterwide ipreallocation, just release it locally (This used to be ctdb commit 9a806dec8687e2ec08a308853b61af6aed5e5d1e)	2012-06-20 15:12:05 +10:00
Ronnie Sahlberg	c7e648c2d1	When we release an ip, get the interface name from the kernel instead of using the interface where ctdb thinks the ip is hosted at. The difference is that this now allows us to handle cases where we want to release an ip but ctdbd does not know which interface the ip is assigned on. (user has used 'ip addr add...' and manually assigned an ip to the wrong interface) (This used to be ctdb commit c6bf22ba5c01001b7febed73dd16a03bd3fd2bed)	2012-06-20 15:11:56 +10:00
Ronnie Sahlberg	59565c05cf	STATISTICS: Add tracking of the 10 hottest keys per database measured in hopcount and add mechanisms to dump it using the ctdb dbstatistics command (This used to be ctdb commit 8307c70ed98996b430c470e9641a09fdeeb81bd8)	2012-06-13 16:19:18 +10:00
Martin Schwenke	55be3c1239	Reimplement logging of long running events Reimplement 5aba53e6adcfcd7edbdac9e30aa5fcba176aca00 using tevent trace points. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 98e1b46adba11b9549b5c5976e1f561fe732fa6e)	2012-06-12 16:10:01 +10:00
Ronnie Sahlberg	2ca402062c	Run the shutdown eventscript before we tear down the transport This allows eventscripts to still be able to call and use ctdb during the shutdown phase. (This used to be ctdb commit 1a6a011c772f7d302d114d7c8a151fa7820ec85f)	2012-05-30 11:51:38 +10:00
Amitay Isaacs	7631830152	server: Replace BOOL datatype with bool, True/False with true/false Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 6e5cbe8fff71985e5a2fc16b7e9f2b868011ff5d)	2012-05-28 11:22:25 +10:00
Ronnie Sahlberg	bbd33d6394	RECOVERY: Increase the time we allow before timing out recovery related tasks. If the system is temporarily taking unusually long to perform these tasks it is better to wait a lot longer and allow the tasks to complete than timing out repeatedly and then becoming banned. (This used to be ctdb commit 03fa2a517247eb2adfba67248e2466f17ea14418)	2012-05-25 12:34:14 +10:00
Ronnie Sahlberg	e7d21834ae	RECOVER: When we pull databases during recovery, we used to reallocate the databuffer for each entry added. This would normally not be an issue, but for cases where memory is fragmented, this could start to cost significant cpu if we need to reallocate and move to a different region. Change this to instead preallocate, by default, 10MByte chunks to the data buffer. This significantly reduces the number of potential reallocate and move operations that may be required. Create a tunable to override/change how much preallocation should be used. (This used to be ctdb commit 1f262deaad0818f159f9c68330f7fec121679023)	2012-05-25 12:34:06 +10:00
Ronnie Sahlberg	26322d257d	DEBUG: Add checks for and print debug messages when 1) a database contains very many records, 2) when a database is very big, 3) when a single record is very big. Add tunables to control when to log these instances and allow it to be completely turned off by setting the threshold to 0 (This used to be ctdb commit 9ed58fef4991725f75509433496f4d5ffae0ae87)	2012-05-21 13:26:13 +10:00
Ronnie Sahlberg	dce5969d12	Debug: When scripts hang, we may need to collect additional data in order to debug why the script hung. Break this debug and datacollection out into an external script to make it easier to modify what data we need to collect. For now we only collect a pstree so we can see what part of the script we hung in. S1037271 (This used to be ctdb commit 6e68797af67bee36f2bad045f94806e7e98f27e9)	2012-05-17 10:29:03 +10:00
Ronnie Sahlberg	a57eba2bb4	Track all child process so we never send a signal to an unrelated process (our child died and kernel wrapped the pid-space and reused the pid for a different process Wrap all creation of child processes inside ctdb_fork() which is used to track all processes we have spawned. Capture SIGCHLD to track also which child processes have terminated. Wrap kill() inside ctdb_kill() and make sure that we never send a !0 signal to a child process pid that has already terminated (and might have been replaced with a (This used to be ctdb commit f73a4b1495830bcdd094a93732a89dd53b3c2f78)	2012-05-03 14:03:26 +10:00
Ronnie Sahlberg	a367fa6138	RELOADIPS: simplify the reloadips code a bit and also update the "read public address file" to not check if the address exists already locally when we read it from the child process, to stop it from spamming the logs with "We already host ..." messages (This used to be ctdb commit 334ea830f1bf33419f4a1e78f23afd41a852d0f4)	2012-05-01 15:34:26 +10:00
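
The data-loss scenario described in commit 8732e2356f (empty records deleted during recovery of a persistent database) can be reproduced with a small model. This is only a sketch under stated assumptions, not CTDB code: the names `recover`, `store` and `scenario` are hypothetical, each node is modelled as a plain dict, an empty byte string stands for a deleted record, and a write is assumed to bump the RSN by one or to start at 1 when no copy exists, as the commit message describes.

```python
from typing import Dict, List, Optional, Tuple

Record = Tuple[int, bytes]   # (rsn, data); empty data marks a deleted record
Node = Dict[str, Record]     # hypothetical stand-in for one node's local tdb


def recover(nodes: List[Node], delete_empty: bool) -> None:
    """Record-by-record recovery: the copy with the highest RSN wins."""
    keys = set().union(*(n.keys() for n in nodes))
    for key in keys:
        copies = [n[key] for n in nodes if key in n]
        winner = max(copies, key=lambda rec: rec[0])
        for n in nodes:
            if delete_empty and winner[1] == b"":
                n.pop(key, None)      # old behaviour: drop empty records
            else:
                n[key] = winner       # patched behaviour: keep them


def store(nodes: List[Node], key: str, data: bytes) -> None:
    """Write a record on the given nodes (assumed RSN rule: +1, or 1 if new)."""
    for n in nodes:
        rsn = n[key][0] + 1 if key in n else 1
        n[key] = (rsn, data)


def scenario(delete_empty: bool) -> Optional[Record]:
    # R was created and deleted several times: empty copy with RSN 10 everywhere.
    a, b, n = ({"R": (10, b"")} for _ in range(3))
    recover([a, b], delete_empty)     # node N is off during this recovery
    store([a, b], "R", b"value")      # R is created again while N is off
    recover([a, b, n], delete_empty)  # node N comes back
    return a.get("R")
```

In this model, `scenario(True)` loses the record because N's empty copy with RSN 10 beats the recreated copy with RSN 1, while `scenario(False)` keeps the recreated copy because the retained empty record forces the new write to RSN 11.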


1103 Commits