mirror of https://github.com/samba-team/samba.git synced 2024-12-24 21:34:56 +03:00
Commit Graph

254 Commits

Author SHA1 Message Date
Amitay Isaacs
385325ad90 recoverd: Fix printing of node flags from local information
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 124e2a471aeda9c900fd898178a30522d7d74221)
2013-01-23 16:56:03 +11:00
Amitay Isaacs
96ba396697 recoverd: Create recoverd monitoring timed events off recoverd context
This ensures that when shutting down CTDB, all the timed events
associated with monitoring recoverd are destroyed and recoverd
is not restarted.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 7393e2b290f9879ff72d5c5a9ce933034129f0e8)
2013-01-09 16:22:39 +11:00
Amitay Isaacs
30299c387f daemon: On shutdown, destroy timed events that check if recoverd is active
When CTDB is shutting down, recovery daemon is stopped, but the
event that checks if recovery daemon is still alive is not destroyed.
So recovery master is restarted during shutdown if CTDB daemon takes
longer to shutdown.

There are two processes that check if recovery daemon is working.

1. ctdb_check_recd() - which checks every 30 seconds if the recovery
   daemon process exists.

2. ctdb_recd_ping_timeout() - which is triggered when recovery daemon
   fails to ping CTDB daemon.

Both the events are periodic and need to be destroyed when shutting down.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 746168df2e691058e601016110fae818c6a265c3)
2013-01-09 13:20:26 +11:00
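The effect described in commit 30299c387f can be illustrated with a small model. This is a sketch in Python, not CTDB code: the `Scheduler` class, the owner label "recd" and the 30 second period used for the ping timeout are assumptions made for the example. Only the 30 second interval of ctdb_check_recd() comes from the commit message. It shows that once the periodic events are destroyed, a slow shutdown no longer fires them.

```python
class Scheduler:
    """Minimal periodic-event loop driven by an explicit clock (hypothetical)."""

    def __init__(self):
        self.events = []   # each entry: [owner, next_due, period, callback]

    def add_periodic(self, owner, period, callback):
        self.events.append([owner, period, period, callback])

    def destroy_owner(self, owner):
        """Drop every event that was created off the given owner."""
        self.events = [e for e in self.events if e[0] != owner]

    def run_until(self, now):
        """Fire every event that is due up to and including time `now`."""
        for ev in self.events:
            while ev[1] <= now:
                ev[3]()
                ev[1] += ev[2]


fired = []
sched = Scheduler()
sched.add_periodic("recd", 30, lambda: fired.append("check_recd"))
sched.add_periodic("recd", 30, lambda: fired.append("ping_timeout"))

sched.run_until(30)          # normal operation: both checks fire once
sched.destroy_owner("recd")  # shutdown: both periodic events are destroyed
sched.run_until(300)         # a slow shutdown triggers nothing further
```

Without the destroy_owner() call, the second run_until() would keep firing both checks, which is the path that restarted the recovery daemon during shutdown.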
Michael Adam
8732e2356f recovery: data corruption of persistent DBs after recoveries: don't delete empty records
The record-by-record mode of recovery deletes empty records.
For persistent databases, this can lead to data corruption
by deleting records that should be there:

- Assume the cluster has been running for a while.

- A record R in a persistent database has been created and
  deleted a couple of times, the last operation being deletion,
  leaving an empty record with a high RSN, say 10.

- Now a node N is turned off.

- This leaves the local database copy of D on N with the empty
  copy of R and RSN 10. On all other nodes, the recovery has deleted
  the copy of record R.

- Now the record is created again while node N is turned off.
  This creates R with RSN = 1 on all nodes except for N.

- Now node N is turned on again. The following recovery will choose
  the older empty copy of R due to RSN 10 > RSN 1.

==> Hence the record is gone after the recovery.

On databases like Samba's registry, this can damage the higher-level
data structures built from the various tdb-level records.

This patch fixes that problem by not deleting empty records in recoveries
for persistent databases.

Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit 6860c79aea416f56cfd7a6af790bbdf495dbc54e)
2012-11-20 00:48:24 +01:00
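The scenario in commit 8732e2356f can be replayed with a small model. This is a sketch in Python, not CTDB code: the functions `recover`, `write` and `scenario` and the `(rsn, data)` tuple layout are hypothetical. The model assumes that a write to an existing record bumps its RSN by one and that a newly created record starts at RSN 1, as in the commit message.

```python
def recover(nodes, delete_empty):
    """Record-by-record recovery: for every key keep the copy with the
    highest RSN and push the result to all participating nodes."""
    merged = {}
    for db in nodes:
        for key, (rsn, data) in db.items():
            if key not in merged or rsn > merged[key][0]:
                merged[key] = (rsn, data)
    if delete_empty:
        merged = {k: v for k, v in merged.items() if v[1] != b""}
    for db in nodes:
        db.clear()
        db.update(merged)


def write(nodes, key, data):
    """Store data on the given nodes, bumping the RSN (1 for a new record)."""
    for db in nodes:
        old_rsn = db[key][0] if key in db else 0
        db[key] = (old_rsn + 1, data)


def scenario(delete_empty):
    # Three nodes; R was created and deleted repeatedly, ending empty at RSN 10.
    a, b, n = ({"R": (10, b"")} for _ in range(3))
    recover([a, b], delete_empty)      # node N is off, the other nodes recover
    write([a, b], "R", b"value")       # R is created again without N
    recover([a, b, n], delete_empty)   # N returns, full recovery
    return a.get("R")


old_behaviour = scenario(delete_empty=True)     # record is lost
fixed_behaviour = scenario(delete_empty=False)  # record survives
```

In the model, keeping the empty record on the surviving nodes means the re-created record gets RSN 11, so it beats the stale empty copy on node N.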
Michael Adam
9c65a7ef81 recoverd: fix a comment typo
Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit 909269a4a3690e1245117ca1af935401455785e6)
2012-11-20 00:48:23 +01:00
Amitay Isaacs
85c8deca3f recoverd: Track the nodes that fail takeover run and set culprit count
If any of the nodes fails the takeover run (either due to a timeout or failure
to complete within the takeover_timeout interval) from the main loop, the recovery
master will give up trying the takeover run with the following message:

  "Unable to setup public takeover addresses. Try again later"

And as a side-effect the monitoring is disabled on all the nodes. Before
ctdb_takeover_run() is called from the main loop, monitoring gets disabled via
the startrecovery event. Since ctdb_takeover_run() fails, it never runs the
recovered event and monitoring does not get re-enabled.

In main_loop, ctdb_takeover_run() is called with a takeover_fail_callback.
This callback will get called if any of the nodes fail in handling
takeip/releaseip/ipreallocated events in ctdb_takeover_run().

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit a5c6bb1fffb8dc3960af113957a1fd080cc7c245)
2012-11-14 10:59:54 +11:00
Martin Schwenke
db5dfe891c recoverd: Add CTDB_SRVID_GETLOG and CTDB_SRVID_CLEARLOG
These support getting and clearing logs from the ring-buffer in the
recovery daemon.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit cbca233d1e03b2410e0bb63b936328d4a8b3c7b4)
2012-10-22 11:15:36 +11:00
Martin Schwenke
bfbcdea610 recoverd: Clarify some misleading log messages
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 14589bf7c16ba017fe00d4e8bea8cc501546c60f)
2012-10-18 20:05:43 +11:00
Martin Schwenke
a884c8c453 recoverd: Verifying local IPs should only check for unhosted available IPs
Currently it checks for unhosted IPs among the known IPs rather than
available IPs.  This means that a takeover run can be flagged even
when that takeover run will be unable to assign a known, unhosted IP.

Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 3cc878bc97fdac764a60ed805f64d649eaab06e8)
2012-10-18 20:05:42 +11:00
Martin Schwenke
4719df62d6 recoverd: Track failure of "recovered" event, banning culprits
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 9550c497e6d6ef5ee44826c4bd9ed5ad65174263)
2012-10-11 12:10:45 +11:00
Martin Schwenke
62046a8a4c recoverd: When starting a takeover run disable IP verification
Disable for TakeoverTimeout seconds.

Otherwise the recovery daemon can get overzealous and start trying
to add/delete addresses that it thinks are missing but where the
eventscript just hasn't finished.  This didn't use to matter so much
but it is more important now that concurrent takeip/releaseip/updateip
generate errors - we want to avoid spamming the log.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 56fcee3c7730cb12fa666072d5400949af6e5f7c)
2012-10-11 12:10:45 +11:00
Martin Schwenke
735c9107e1 recoverd: All inactive nodes should yield recovery master role
Not just stopped nodes.  In reality, this means that banned nodes will
also yield, since nodes in the other inactive states won't be running
a daemon.

This seems sensible since if another node notices that an inactive
node is the recovery master then it will force an election anyway.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit fc18188b7b63eb0dafbc47e3abf80e306e1dfc31)
2012-08-08 16:15:03 +10:00
Martin Schwenke
97248de3a9 recoverd: An inactive node should not force recovery master elections
An inactive node can't become the recovery master.  So if an inactive
node notices that the recovery master is inactive, it shouldn't force
an election for recovery master and nominate itself as a candidate.
This can cause the recovery master to flip-flop between nodes when all
nodes are inactive.

If there is actually an active node then it will trigger the election.

This is fairly cosmetic but is a step along the way towards ironing
out weirdness when all nodes are stopped.

Also, fix a related comment.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit e7dc10da3ced54ea9d719ad167ee42dcca8dce75)
2012-08-08 16:14:52 +10:00
Martin Schwenke
20b75046fa recoverd: main_loop() should not verify local IPs if node is stopped
Doing these checks is pointless and potentially causes unnecessary log
messages.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a0c30c820fd47d4f8620dc060c825be10754f5d1)
2012-08-08 16:11:11 +10:00
Martin Schwenke
ae0cdd137f recoverd: verify_local_ip_allocation() should dup ifaces before early return
If CTDB starts in STOPPED state then it thinks it is in the middle of
a recovery.  rec->ifaces is also NULL and an early exit further down
(that checks to see if a recovery is in process) means that it stays
that way.

However, each time this function is entered the need for a takeover
run is re-flagged.  The takeover run never happens due to the
early exit, causing a couple of unneeded messages to be logged each
time.

This is avoided by moving the code that sets rec->ifaces so that it is
executed earlier and, in this case, in the middle of a recovery.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit f586e8a2911fc6e7f6698f516653145d8fd45dad)
2012-08-08 16:11:11 +10:00
Martin Schwenke
d038b9e8ba recoverd: Fix bogus info in message about changed flags
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 9119a568c2b4601318f7751f537dca2f92a7230b)
2012-08-08 16:11:11 +10:00
Ronnie Sahlberg
694c1b269e When we find an ip we shouldn't host, just release it
Don't call a full-blown cluster-wide ipreallocation, just release it locally.

(This used to be ctdb commit 9a806dec8687e2ec08a308853b61af6aed5e5d1e)
2012-06-20 15:12:05 +10:00
Ronnie Sahlberg
e7d21834ae RECOVER: When we pull databases during recovery, we used to reallocate the data buffer for each entry added. This would normally not be an issue, but for cases where memory is fragmented, this could start to cost significant CPU if we need to reallocate and move to a different region.
Change this to instead preallocate, by default, 10 MByte chunks to the data buffer.
This significantly reduces the number of potential reallocate and move operations that may be required.

Create a tunable to override/change how much preallocation should be used.

(This used to be ctdb commit 1f262deaad0818f159f9c68330f7fec121679023)
2012-05-25 12:34:06 +10:00
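The saving described in commit e7d21834ae can be estimated with a small model. This is a sketch in Python, not the CTDB implementation: `PullBuffer` and `count_reallocs` are hypothetical names, and the model only counts how often the buffer has to grow. It does not measure the cost of moving memory.

```python
class PullBuffer:
    """Models a growing data buffer and counts how often it must be resized."""

    def __init__(self, prealloc=0):
        self.prealloc = prealloc   # extra bytes reserved on each resize
        self.capacity = 0
        self.used = 0
        self.reallocs = 0

    def append(self, record_len):
        needed = self.used + record_len
        if needed > self.capacity:
            # Grow to what is needed now plus one preallocation chunk.
            self.capacity = needed + self.prealloc
            self.reallocs += 1
        self.used = needed


def count_reallocs(num_records, record_len, prealloc):
    buf = PullBuffer(prealloc)
    for _ in range(num_records):
        buf.append(record_len)
    return buf.reallocs


per_record = count_reallocs(100_000, 1024, prealloc=0)
chunked = count_reallocs(100_000, 1024, prealloc=10 * 1024 * 1024)
```

For 100,000 records of 1 KiB each, the model resizes the buffer 100,000 times without preallocation and 10 times with a 10 MByte chunk.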
Ronnie Sahlberg
a57eba2bb4 Track all child processes so we never send a signal to an unrelated process (our child died and kernel wrapped the pid-space and reused the pid for a different process)
Wrap all creation of child processes inside ctdb_fork() which is used to track all processes we have spawned.
Capture SIGCHLD to track also which child processes have terminated.

Wrap kill() inside ctdb_kill() and make sure that we never send a !0 signal to a child process pid that has already terminated (and might have been replaced with a

(This used to be ctdb commit f73a4b1495830bcdd094a93732a89dd53b3c2f78)
2012-05-03 14:03:26 +10:00
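The bookkeeping described in commit a57eba2bb4 can be sketched as follows. This is a Python model, not the ctdb_fork()/ctdb_kill() code: the `ChildTracker` class and its method names are hypothetical, and the kill function is injected so the example never signals a real process.

```python
import signal


class ChildTracker:
    """Remembers which child PIDs are still ours (hypothetical helper)."""

    def __init__(self, kill_fn):
        self._kill = kill_fn      # e.g. os.kill in a real program
        self._children = set()

    def forked(self, pid):
        """Record a PID returned by fork()."""
        self._children.add(pid)

    def reaped(self, pid):
        """Call from the SIGCHLD handling path once the child has exited."""
        self._children.discard(pid)

    def kill(self, pid, sig):
        """Send sig only if pid is a live child. Signal 0 is always allowed
        because it only probes for existence."""
        if sig != 0 and pid not in self._children:
            return False
        self._kill(pid, sig)
        return True


sent = []
tracker = ChildTracker(lambda pid, sig: sent.append((pid, sig)))
tracker.forked(4242)
ok_before = tracker.kill(4242, signal.SIGTERM)  # child alive: signal is sent
tracker.reaped(4242)
ok_after = tracker.kill(4242, signal.SIGTERM)   # PID may be reused: refused
probe = tracker.kill(4242, 0)                   # existence probe still allowed
```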
Ronnie Sahlberg
7a1aa560e7 Add new control to reload the public ip address file on a node
Also add a method to use the recovery master/daemon to reload the public ips on all nodes in the cluster.
Reloading the public ips on all nodes in the cluster is only supported if all nodes in the cluster are available and healthy.

(This used to be ctdb commit 05603e914f8c12618d7e06943c0f7df207f645b0)
2012-05-01 10:48:08 +10:00
Ronnie Sahlberg
db411aaada Merge remote branch 'amitay/tevent-sync'
(This used to be ctdb commit 17ff3f240b0d72c72ed28d70fb9aeb3b20c80670)
2012-04-26 08:09:23 +10:00
Amitay Isaacs
4392591555 Remove explicit include of lib/tevent/tevent.h.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 0681014ca5ed2a9b56f63fdace7f894beccf8a9a)
2012-04-13 17:28:14 +10:00
Amitay Isaacs
202791cf72 recoverd: Fix spurious warnings when running with --nopublicipcheck
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 7f8096f56d8274151705ac822b582d972078f8fe)
2012-04-13 15:38:11 +10:00
Martin Schwenke
fbe64dec01 Undo damage done by d8d37493478a26c5f1809a5f3df89ffd6e149281
The implementation of DisableIPFailover got intermingled with
--nopublicipcheck.  This just looks wrong - Ronnie must have been
having a bad day.  :-)

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 5083b266dd68b292c4275505f3d1b878dbf12f11)
2012-03-22 15:34:52 +11:00
Ronnie Sahlberg
f3600276fc Add a tunable variable to control how long we defer after a ctdb addip until we force a rebalance and try to failback addresses onto this node
Have it default to 300 seconds.

(This used to be ctdb commit 49791db7dc74cffd7e88bd73091590cdc1909328)
2012-02-28 06:58:59 +11:00
Ronnie Sahlberg
ef2bd0b016 When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance.
(This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2)
2012-02-28 06:56:04 +11:00
Ronnie Sahlberg
0420449a6c Recover Persistent database DB by DB and not record by record
Add a new tunable that changes the mode how persistent databases are recovered.
RecoveryPDBBySeqNum

When set to 1, persistent databases will be recovered in whole from the node which
has the highest "__db_sequence_number__" record.
This record is managed by samba for those databases where we do persistent writes and have
inter-record relations.
For these databases we do not want the usual "blend records from all nodes based
on individual record RSN" but instead a mode where we pick one instance of the persistent database.

If no node was found with a "__db_sequence_number__" record at all, we fall back to the original "recover records independently based on record RSN".
Some persistent databases do not contain record interrelations and as such do not
contain this special record at all.

(This used to be ctdb commit 502150c764298a9fa8c4d8aa445bf7d85d4ee9dc)
2011-11-30 08:48:23 +11:00
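The two recovery modes described in commit 0420449a6c can be contrasted with a small model. This is a sketch in Python, not CTDB code: `recover_by_rsn` and `recover_persistent` are hypothetical names, each node's database is a dict mapping key to `(rsn, data)`, and the sequence number is stored as a plain integer in the data field, which is an assumption of the model.

```python
SEQNUM_KEY = "__db_sequence_number__"


def recover_by_rsn(copies):
    """Blend records from all nodes, keeping the highest RSN per key."""
    merged = {}
    for db in copies:
        for key, (rsn, data) in db.items():
            if key not in merged or rsn > merged[key][0]:
                merged[key] = (rsn, data)
    return merged


def recover_persistent(copies, by_seqnum):
    """copies: one dict per node mapping key -> (rsn, data)."""
    if by_seqnum:
        with_seq = [db for db in copies if SEQNUM_KEY in db]
        if with_seq:
            # Take one node's database as a whole; never mix records.
            best = max(with_seq, key=lambda db: db[SEQNUM_KEY][1])
            return dict(best)
    # Tunable off, or no node has the special record: blend by RSN.
    return recover_by_rsn(copies)


node0 = {SEQNUM_KEY: (3, 7), "a": (9, "old-a"), "b": (2, "old-b")}
node1 = {SEQNUM_KEY: (4, 8), "a": (5, "new-a"), "b": (3, "new-b")}

whole = recover_persistent([node0, node1], by_seqnum=True)
blended = recover_persistent([node0, node1], by_seqnum=False)
no_seq = recover_persistent([{"x": (1, "p")}, {"x": (2, "q")}], by_seqnum=True)
```

In the blended result, record "a" comes from node0 and record "b" from node1, which is exactly the mixing that breaks inter-record relations. The whole-database mode returns node1's copy unchanged.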
Stefan Metzmacher
3aa5c979f3 recoverd: try to become the recovery master if we have the capability, but the current master doesn't
metze
(cherry picked from commit 6ba8af28f8a8f79db65120a97d7157dcc5c7e083)

Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit ccd67cf7f26713e695000d89d9ce8cfa78bfe00f)
2011-11-29 10:28:52 +01:00
Ronnie Sahlberg
b18a22b820 This breaks the build since the recovery loop is different in master
compared to old 1.0 branches
This must have been mistakenly applied to master when you intended to push
for a different branch i guess.

Revert "recoverd: try to become the recovery master if we have the capability, but the current master doesn't"

This reverts commit a97d417aba85e901540147a4dff4794249442939.

(This used to be ctdb commit c19cb751077b78cf4b6e28a1e3746d4ffedbfd68)
2011-11-29 14:38:02 +11:00
Stefan Metzmacher
b02b55bd12 recoverd: try to become the recovery master if we have the capability, but the current master doesn't
metze

(This used to be ctdb commit a97d417aba85e901540147a4dff4794249442939)
2011-11-26 23:47:00 +01:00
Stefan Metzmacher
7a962685d3 recoverd: let async_getcap_callback() also update ctdb->capabilities
metze

(This used to be ctdb commit ef5b47d1183ee99c39ae63045a994d35255ac829)
2011-11-26 23:30:33 +01:00
Martin Schwenke
02612ea2bc Clean up warnings: remove changed_flags in monitor_helper
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 3e4fa518f02db75e4e4a7f326a71df226913f8a8)
2011-11-09 14:45:01 +11:00
Ronnie Sahlberg
0dc5584101 Merge branch 'master-readonly-records' into foo
Conflicts:

	Makefile.in
	tools/ctdb.c

(This used to be ctdb commit 0fedef0ffba4178126eee9544c5e2db52f5db893)
2011-09-12 09:34:34 +10:00
David Disseldorp
5296da5609 client: add timeout argument to ctdb_attach
Rather than using a fixed 2 second CTDB_CONTROL_GETDBPATH timeout.

(This used to be ctdb commit 9e178671560cb95121e11d718a76b05380ecd6c5)
2011-09-06 13:57:04 +02:00
Ronnie Sahlberg
63dc96cdb2 ReadOnly: Change the ctdb_db structure to keep a uint8_t for flags instead of a boolean for
the persistent flag.
This is the same size as the original boolean but allows us to add additional flags for the database

(This used to be ctdb commit 7462761638d25880ad46024ad4ef21667eb99a98)
2011-09-01 10:21:55 +10:00
Ronnie Sahlberg
10caf186e1 remove log message we don't need
S1026492

(This used to be ctdb commit c5f6e44b92210519d4bfc24611cae3f9978cc2e5)
2011-08-04 13:49:57 +10:00
Ronnie Sahlberg
ae35e9e5b2 Cleanup of logging messages/spamming
Reduce an informational message about not performing ip reallocation
from NOTICE (the default) to INFO.
These messages are normal during startup or when stopped/banned when
we will be in recovery mode for a while.

Remove a message in the loop waiting for initial startup to complete about
the generation being invalid. It is always invalid at this stage before we have
finished initial recovery.

Rate-limit the informational messages for CTDB_WAIT_UNTIL_RECOVERED
so that we only print them once per second for the first 60 seconds and after that only once per 10 minutes.
These messages are normal during startup, but we should not be logging them every second for cases where we will remain in recovery mode during startup for an extended period of time,
such as if suspended or permabanned.

CQ S1023302

(This used to be ctdb commit 3a0af8780dc595acbed880f288fcbc4f62c862fb)
2011-05-04 10:42:32 +10:00
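The rate limit described in commit ae35e9e5b2 (once per second for the first 60 seconds, then once per 10 minutes) can be sketched like this. The Python functions `should_log` and `simulate` are hypothetical and only model the schedule. They are not the CTDB_WAIT_UNTIL_RECOVERED code, and treating second 60 as still part of the first window is an assumption.

```python
def should_log(elapsed, last_logged):
    """elapsed: whole seconds spent waiting so far.
    last_logged: elapsed value at the previous message, or None if nothing
    has been logged yet."""
    if last_logged is None:
        return True
    interval = 1 if elapsed <= 60 else 600
    return elapsed - last_logged >= interval


def simulate(total_seconds):
    """Return the elapsed times at which a message would be printed."""
    logged = []
    last = None
    for t in range(total_seconds + 1):
        if should_log(t, last):
            logged.append(t)
            last = t
    return logged


times = simulate(3600)   # one hour stuck in recovery mode
```

Over one hour the model prints a message every second up to second 60 and then only at 10 minute intervals, instead of 3600 times.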
Michael Adam
2ad1c3f6c7 server: in the VACUUM_FETCH handler, add the VACUUM_MIGRATION to the call flags
This way, the records coming in via this handler can be treated appropriately.
Namely, they can be deleted instead of being stored when they meet the fast-path
vacuuming criteria (empty, never migrated with data...)

(This used to be ctdb commit fb5d832104970320359b3e474eb291ca3d629380)
2011-03-14 13:35:44 +01:00
Michael Adam
89f27f9424 recoverd: in a recovery, set the MIGRATED_WITH_DATA flag on all records
Those records that are kept after recovery, are non-empty, and
stored identically on all nodes. So this is as if they had been
migrated with data.

Pair-Programmed-With: Stefan Metzmacher <metze@samba.org>

(This used to be ctdb commit 101be642e492a3a54231e2e3e6553a59380fe702)
2011-03-14 13:35:43 +01:00
Ronnie Sahlberg
49a30783d3 If/when the recovery daemon terminates unexpectedly, try to restart it again from the main daemon instead of just shutting down the main daemon too.
While it does not address the reason for recovery daemon shutting down, it reduces the impact of such issues and makes the system more robust.

(This used to be ctdb commit 0566ef3d6cef809bda204877c493c80ff9eb2c40)
2011-03-01 12:13:58 +11:00
Ronnie Sahlberg
d236c970d0 recoverd: avoid triggering a full recovery if just some ip allocation
has failed.
We don't need to rebuild the databases in this situation, we just
need to try again to sort out the ip address allocations.

(This used to be ctdb commit 044c398ffea23d36ee033c8ddf07d11028197346)
2011-01-11 07:40:49 +11:00
Ronnie Sahlberg
c4006ce844 Add ctdb_fork() which will fork a child process and drop the real-time
scheduler for the child.

Use ctdb_fork() from callers where we don't want the child to be running
at real-time privilege.

(This used to be ctdb commit 58795a4c9e0624e20fa3e0023b65127053edd103)
2011-01-11 07:40:41 +11:00
Ronnie Sahlberg
c2c53db49d during ip allocation, there are failure modes where a node might hold an ip address
but think it is still unassigned (-1).

add code to the recovery daemon to detect this case and trigger a reallocation
so that the ip gets covered

and change the takeip code to allow for this condition, taking on an ip address that is
already hosted.

cq s1021073

(This used to be ctdb commit 9020baf27cab7821c9094cda185206fb7af0fee7)
2010-12-03 13:30:39 +11:00
Ronnie Sahlberg
7e29fd6093 Don't check remote ip allocation if public ip mgmt is disabled
(This used to be ctdb commit 441ad00af842a8b7b5291de60d8ab08a064f5327)
2010-11-10 14:55:25 +11:00
Ronnie Sahlberg
a6ed66dfd0 don't check the public ip assignment, or even whether we are hosting ips we shouldn't,
when public ips have been disabled

(This used to be ctdb commit 7d07a74dc7f907ac757d20626f68e257d7ba16be)
2010-11-10 14:55:24 +11:00
Ronnie Sahlberg
5f76f3c0e2 Add a new tunable : DisableIPFailover that when set to non 0
will stop any ip reallocations at all from happening.

(This used to be ctdb commit d8d37493478a26c5f1809a5f3df89ffd6e149281)
2010-11-10 14:55:24 +11:00
Ronnie Sahlberg
107d020cfa update/improve the log message related to rerecovery timeouts
(This used to be ctdb commit 8b4d1df3abcae03cf7a339d8390c816682a43019)
2010-09-28 08:47:12 +10:00
Stefan Metzmacher
5e46150490 server/recoverd: if we can't get the recovery lock, ban ourself
metze

(This used to be ctdb commit 80b8889267339b870868841ff077e850bc5b52e2)
2010-09-14 15:49:01 +10:00
Stefan Metzmacher
ff77985f38 server/recoverd: do takeover_run after verifying the reclock file
metze

(This used to be ctdb commit 93df096773c89f21f77b3bcf9aa90bf28881b852)
2010-09-14 15:48:37 +10:00
Ronnie Sahlberg
2e8aac6689 Merge commit 'rusty/ports-from-1.0.112' into foo
(This used to be ctdb commit 13e58d92f5f1723e850a82ae030d0ca57e89b1ee)
2010-08-19 13:17:56 +10:00