RECOVERY MASTER
===============
This is one of the nodes in the cluster that has been designated to
be the "recovery master".
The recovery master is responsible for performing full checks of cluster and cluster node consistency and is also responsible for performing the actual database recovery procedure.
Only one node at a time can be the recovery master.
This is ensured by CTDB using a lock on a single file in the shared GPFS filesystem:
/etc/sysconfig/ctdb :
...
# Options to ctdbd. This is read by /etc/init.d/ctdb
# you must specify the location of a shared lock file across all the
# nodes. This must be on shared storage
# there is no default here
CTDB_RECOVERY_LOCK=/gpfs/.ctdb/shared
...
To prevent two nodes from becoming recovery master at the same time (i.e. split brain),
CTDB relies on GPFS to guarantee coherent locking across the cluster.
Thus CTDB relies on GPFS only ever allowing one ctdb process on one node to take out and
hold this lock.
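
One way to implement such a cluster-wide mutex on a single shared file is a POSIX
byte-range lock (fcntl with F_SETLK), which GPFS enforces coherently across all nodes.
The sketch below only illustrates the idea and is not the actual ctdb code; the
function name is made up, while the path is the CTDB_RECOVERY_LOCK value from the
example above.

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Try to take the cluster-wide recovery lock.  Returns the open fd on
 * success (the lock is held for as long as the fd stays open) or -1 if
 * another process, possibly on another node, already holds the lock.
 * Illustrative sketch only, not the real ctdb implementation. */
static int take_recovery_lock(const char *lockfile)
{
        struct flock fl = {
                .l_type   = F_WRLCK,    /* exclusive lock */
                .l_whence = SEEK_SET,
                .l_start  = 0,
                .l_len    = 1,          /* lock a single byte */
        };
        int fd = open(lockfile, O_RDWR | O_CREAT, 0600);

        if (fd == -1) {
                perror("open recovery lock file");
                return -1;
        }
        if (fcntl(fd, F_SETLK, &fl) == -1) {
                /* somebody else holds the lock */
                close(fd);
                return -1;
        }
        return fd;
}

int main(void)
{
        int fd = take_recovery_lock("/gpfs/.ctdb/shared");

        if (fd == -1) {
                printf("another node is already recovery master\n");
                return 1;
        }
        printf("we hold the recovery lock\n");
        pause();        /* keep the fd, and thus the lock, open */
        return 0;
}

Because the kernel drops such a lock automatically when the holding file descriptor is
closed (for example when the process dies), a crashed recovery master cannot keep the
other nodes locked out indefinitely.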
The recovery master is designated through an election process.
VNNMAP
======
The VNNMAP is a list of all nodes that are currently part of the cluster
and participate in hosting the cluster databases.
All nodes that are CONNECTED but not BANNED are present in the VNNMAP.
The VNNMAP is the list of LMASTERS for the cluster, as reported by 'ctdb status':
...
Size:3
hash:0 lmaster:0
hash:1 lmaster:1
hash:2 lmaster:2
...
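
As a rough illustration of how the VNNMAP is used, assuming that records are assigned
to an LMASTER by indexing the map with the record's key hash, a lookup could look like
the sketch below; the structure and function names are placeholders, not the real ctdb
types.

#include <stdint.h>
#include <stdio.h>

/* Illustrative stand-in for the VNNMAP: the node numbers that currently
 * participate in hosting the cluster databases. */
struct vnn_map {
        uint32_t size;          /* number of entries, "Size:3" above */
        uint32_t *map;          /* map[hash % size] -> lmaster node  */
};

/* Pick the LMASTER node for a record, given the hash of its key. */
static uint32_t lmaster_for_hash(const struct vnn_map *vnnmap, uint32_t hash)
{
        return vnnmap->map[hash % vnnmap->size];
}

int main(void)
{
        uint32_t nodes[] = { 0, 1, 2 };         /* matches the output above */
        struct vnn_map vnnmap = { .size = 3, .map = nodes };

        printf("hash 7 -> lmaster %u\n",
               (unsigned)lmaster_for_hash(&vnnmap, 7));
        return 0;
}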
CLUSTER MONITORING
==================
All nodes in the cluster monitor their own health and their own consistency with regard to the
recovery master. How and what the nodes monitor for differs between the node which is
the recovery master and normal nodes.
This monitoring is to ensure that the cluster is healthy and consistent.
It is not related to the monitoring of individual node health, a.k.a. eventscript monitoring.
At the end of each step in the process are listed some of the most common and important
error messages that can be generated during that step.
NORMAL NODE CLUSTER MONITORING
------------------------------
Monitoring is performed in the dedicated recovery daemon process.
The implementation can be found in server/ctdb_recoverd.c:monitor_cluster()
This is an overview of the more important tasks during monitoring.
These tests are to verify that the local node is consistent with the recovery master.
Once every second the following monitoring loop is performed (a compressed sketch of its shape is given after the list) :
1, Verify that the parent ctdb daemon on the local node is still running.
If it is not, the recovery daemon logs an error and terminates.
"CTDB daemon is no longer available. Shutting down recovery daemon"
2, Check if any of the nodes has been recorded to have misbehaved too many times.
If so we ban the node and log a message :
"Node %u has caused %u failures in %.0f seconds - banning it for %u seconds"
3, Check that there is a recovery master.
If not we initiate a clusterwide election process and log :
"Initial recovery master set - forcing election"
and we restart monitoring from 1.
4, Verify that the recovery daemon and the local ctdb daemon agree on all the
node BANNING flags.
If the recovery daemon and the local ctdb daemon disagree on these flags we update
the local ctdb daemon, log one of two messages and restart monitoring from 1 again.
"Local ctdb daemon on recmaster does not think this node is BANNED but the recovery master disagrees. Unbanning the node"
"Local ctdb daemon on non-recmaster does not think this node is BANNED but the recovery master disagrees. Re-banning the node"
5, Verify that the node designated to be recovery master exists in the local list of all nodes.
If the recovery master is not in the list of all cluster nodes a new recovery master
election is triggered and monitoring restarts from 1.
"Recmaster node %u not in list. Force reelection"
6, Check if the recovery master has become disconnected.
If it has, log an error message, force a new election and restart monitoring from 1.
"Recmaster node %u is disconnected. Force reelection"
7, Read the node flags off the recovery master and verify that it has not become banned.
If it has, log an error message, force a new election and restart monitoring from 1.
"Recmaster node %u no longer available. Force reelection"
8, Verify that the recmaster and the local node agree on the flags (BANNED/DISABLED/...)
for the local node.
If there is an inconsistency, push the flags for the local node out to all other nodes.
"Recmaster disagrees on our flags flags:0x%x recmaster_flags:0x%x Broadcasting out flags."
9, Verify that the local node hosts all public ip addresses it should host and that it does
NOT host any public addresses it should not host.
If there is an inconsistency we log an error, trigger a recovery to occur and restart
monitoring from 1 again.
"Public address '%s' is missing and we should serve this ip"
"We are still serving a public address '%s' that we should not be serving."
These are all the checks we perform during monitoring for a normal node.
These tests are performed on all nodes in the cluster, which is why the loop is optimized to perform
as few network calls to other nodes as possible.
Each node makes only one call to the recovery master in each loop, and no calls to any other node.
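
For orientation, here is a compressed, compilable sketch of the shape of this
once-per-second loop. All helper names are placeholders for the checks in steps 1 to 9
above, not the real symbols in server/ctdb_recoverd.c, and the stubs simply return
fixed values.

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Stubbed checks standing in for steps 1-9; none of these are actual
 * ctdb function names. */
static bool parent_daemon_alive(void)        { return true; }
static bool node_misbehaved_too_often(void)  { return false; }
static bool have_recmaster(void)             { return true; }
static bool ban_flags_consistent(void)       { return true; }
static bool recmaster_usable(void)           { return true; }  /* steps 5-7 */
static bool local_flags_consistent(void)     { return true; }
static bool public_ips_consistent(void)      { return true; }
static void force_election(void)             { puts("Force reelection"); }
static void trigger_recovery(void)           { puts("Trigger recovery"); }

/* Once-per-second monitoring loop on a normal node: a failing check
 * either terminates the daemon, forces an election or triggers a
 * recovery, and the loop then restarts from step 1. */
static void monitor_cluster_sketch(void)
{
        for (;;) {
                sleep(1);

                if (!parent_daemon_alive())             /* step 1 */
                        exit(1);
                if (node_misbehaved_too_often())        /* step 2 */
                        puts("banning node");
                if (!have_recmaster()) {                /* step 3 */
                        force_election();
                        continue;
                }
                if (!ban_flags_consistent()) {          /* step 4 */
                        puts("updating local BANNED state");
                        continue;
                }
                if (!recmaster_usable()) {              /* steps 5, 6 and 7 */
                        force_election();
                        continue;
                }
                if (!local_flags_consistent())          /* step 8 */
                        puts("Broadcasting out flags");
                if (!public_ips_consistent())           /* step 9 */
                        trigger_recovery();
        }
}

int main(void)
{
        monitor_cluster_sketch();
        return 0;
}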