ctdb_recoverd.c
Always handle banning/unbanning locally on the node that is being
banned/unbanned instead of on the recovery master.
This means that if a ban request comes in to the recovery master for a
remote node, we pass the request on to the remote node instead of
setting up the ban and ban timeouts locally.
ctdb.c
send ban/unban requests to the node being banned/unbanned instead of to
the recmaster
(This used to be ctdb commit 880dd9f5fd0b91e450da93e195cc5c62cb1dcd6e)
the banned_nodes array and not the rec structure so that ban_state is
destroyed when the banned_nodes array gets destroyed
(and so that when this struct is destroyed, any pending
ctdb_ban_timeout events are also destroyed.)
otherwise we may end up with multiple ban_timeout timed events going in
parallel since we destroy/recreate the banned_nodes structure during
election but we never destroy/recreate the rec structure.
(This used to be ctdb commit fbd663d56a2a4421a5c0e541962c87e2e9c7cd82)
flag.
change calling of the recovered/takeip/releaseip event scripts to use
these enable/disable functions instead of stopping/starting monitoring.
when we disable monitoring we want all events to still be running,
in particular the events that monitor for dead nodes, and we only want to
suppress running the monitor event scripts
(This used to be ctdb commit a006dcc4f75aba950dd701ad7d1a84e89df285e8)
monitoring should always be enabled
(though a node may want to temporarily disable running the "monitor"
event scripts but can do so internally without the need for this
control)
(This used to be ctdb commit e3a33618026823e6af845fd8513cddb08e6b5584)
Check whether monitoring is enabled or not before creating new events
and otherwise log why the event is not set up
(This used to be ctdb commit 2f352b2606c04a65ce461fc2e99e6d6251ac4f20)
control, instead call ctdb_start/stop_monitoring()
ctdb_stop_monitoring(): don't allocate a new monitoring context, leave it
NULL. Also set the monitoring_mode in this function so that
ctdb_stop/start_monitoring() and ->monitoring_mode are kept in sync.
Add a debug message to log that we have stopped monitoring.
ctdb_start_monitoring(): check whether monitoring is already active and
make the function idempotent.
Create the monitoring context when monitoring is started.
Update ->monitoring_mode once the monitoring has been started.
Add a debug message to log that we have started monitoring.
When we temporarily stop monitoring while running an event script,
restart monitoring after the event script wrapper returns instead of in
the event script callback.
Let monitoring_mode start out as DISABLED and let it be enabled once we call ctdb_start_monitoring.
don't check for MONITORING_DISABLED in check_for_dead_nodes(). If
monitoring is disabled, this event handler will not be called.
(This used to be ctdb commit 3a93ae8bdcffb1adbd6243844f3058fc742f76aa)
when we are the recmaster and we update the local flags for all the
nodes, if one of the nodes fails to respond and give us its flags,
set that node as a "culprit"
as one of the first things to do in the monitor_cluster loop, check if
the current culprit has caused too many (20) failures and if so ban that
node.
this is for the situation where a remote node may still be CONNECTED but
fails to respond to the getnodemap control, causing the recovery
master to loop in monitor_cluster, aborting the monitoring each time the
node fails to respond, but without anything ever triggering a call to
do_recovery().
If one or more of the databases or nodes are frozen at this stage, this
would lead to smbd being blocked for potentially a longish time.
(This used to be ctdb commit 83b0261f2cb453195b86f547d360400103a8b795)
specific instance of ctdbd should bind to. This helps when running a
"virtual" cluster on a single machine where all instcances bind to
different alias interfaces.
If --node-ip is specified, then we will only try to bind to this ip
address only. Othervise we fall back to the original method trying the
ip addresses in /etc/ctdb/nodes one by one until we find one we can bind
to.
No variable in /etc/sysconfig/ctdb added since this parameter only makes
sense in a virtual test/debug cluster.
(This used to be ctdb commit d96cb02c2c24f9eabbc53d3d38e90dea49cff3e0)
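As an illustration of the --node-ip option described above (the addresses
here and the omission of the other required options are assumptions for
this sketch, not taken from the commit), two ctdbd instances on one host
could be bound to different loopback aliases like this:

    # hypothetical virtual cluster: 127.0.0.11 and 127.0.0.12 are assumed
    # to be alias addresses listed in /etc/ctdb/nodes; all other required
    # ctdbd options are omitted for brevity
    ctdbd --node-ip=127.0.0.11
    ctdbd --node-ip=127.0.0.12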
recovery daemon and the ctdb daemon both agree on whether the node is
banned or not, and if they disagree, re-ban the node after
logging an error to the debug log
(This used to be ctdb commit 6cd6e534493066edd4bb2c6ae5be0e9a9d495aa0)
when these functions are called to ban or unban a node make sure we
update the CTDB_NODE_BANNED flag in rec->node_flags since this field and
flag are checked during the election process
(This used to be ctdb commit 740c632ae96a2d34327d1b575780aaf079d93f4f)
so it differs from what the local ctdb daemon on the recovery master
thinks it should be, we should call for a re-election
(This used to be ctdb commit 21ad6039c31ef5cc0e40a35a41220f91943947cb)
flags differ between the local ctdb daemon and the remote node
we can force a flags update on all nodes and not just the local daemon
(This used to be ctdb commit a924eb89c966ecbae029ca137e06cffd40cc70fd)
flags
in update_local_flags()
(this is only called if we are, or believe we are, the recmaster)
when we detect that the flags of a remote node are different from what
our local node thinks the flags should be for that remote node
we should send a node-flag-changed message to the local daemon so
that it updates the flags for that node.
(This used to be ctdb commit 36225e4e271f7a4065398253747fb20054f99a53)
when monitoring that all nfs shares are available, allow both ' ' and
'\t' characters to separate the exported directory from the options
in /etc/exports
(This used to be ctdb commit ac6cfe9de0acdcf9461068684fa890504454aae4)
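For example, with this change both of the following /etc/exports lines
should be parsed the same way (the paths, network and options are made up
for illustration); the first uses a tab between the directory and the
options, the second a space:

    /export/data	192.168.1.0/24(rw,sync)
    /export/home 192.168.1.0/24(rw,sync)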
of the startup event scripts after the point where recovery has
started and the node is in normal operation
This makes the 'startup' script just a special type of the 'monitor'
script which is called first
(This used to be ctdb commit 7424c30a5fd04aea0137c466b4318c3f185280d8)
check if mountd is running during monitoring and if it is not, try to restart it
(This used to be ctdb commit 3d4b74669164b519398aeeacd59714f1e3884eff)
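A minimal sketch of that check (the probe and restart commands here are
assumptions; the real event script may detect and restart mountd
differently, e.g. via rpcinfo and the distribution's nfs init script):

    # monitor event sketch: if rpc.mountd is not running, try to restart nfs
    if ! pgrep rpc.mountd >/dev/null 2>&1 ; then
        service nfs restart
    fi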
mainly useful for avoiding ack-storms when doing very rapid
failover/failback during testing but should not be required in the
real world.
this gets rid of a lot of annoying messages from the messages file
(This used to be ctdb commit 50d289dcce2caa7c7be9b6faa3b38b69c2237038)
shut down and restart the transport
otherwise, if we use the tcp transport, the tcp connection might try to
retransmit the queued data during the time the node is unavailable.
this together with the exponential backoff for tcp means that the tcp
connection quickly reaches the maximum backoff rto which is often 60 or
120 seconds. this would mean that it could take up to 60/120 seconds
before the tcp layer detects that the connection is dead and it has to
be reestablished.
(This used to be ctdb commit 0256db470879ce556b0f00070f7ebeaf37e529ab)
recovery mode back to NORMAL that we can not lock the reclock file
since at this stage it MUST be locked by the recovery daemon.
in order to prevent the non-blocking fcntl() lock call from blocking and
causing "issues", we move the 'test that we can not lock the reclock file'
into a child process.
(This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e)
public addresses to nodes deterministic.
Activate it by adding CTDB_SET_DeterministicIPs=1 in /etc/sysconfig/ctdb
When this is set, the first entry in /etc/ctdb/public_addresses will
always be hosted by node 0 when that node is available, the second
entry by node 1, and so on.
This tunable allows the allocation of addresses to become very
unbalanced and is only for debugging/testing use.
Beware, this feature requires that the /etc/ctdb/public_addresses files
are identical on all the nodes in the cluster.
(This used to be ctdb commit f0ca221f235731542090d8a6c86f2b7cd2ce2f96)
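Putting the above together, a minimal configuration sketch (the addresses
are illustrative only):

    # /etc/sysconfig/ctdb
    CTDB_SET_DeterministicIPs=1

    # /etc/ctdb/public_addresses -- must be identical on all nodes; with
    # DeterministicIPs the first entry prefers node 0, the second node 1, ...
    192.168.1.101/24
    192.168.1.102/24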
even though we don't want a blocking lock, it does appear that the fcntl()
call can block for a while if gpfs is in the process of rebuilding
itself after a node arrives/leaves the cluster
(This used to be ctdb commit 6c0d206dea7116db71bccb4802a93dd7283249f6)
make sure we read and update the flags from all remote nodes before we
reach the first codepath that can call do_recovery()
since during do_recovery() we need to know what the flags are.
(This used to be ctdb commit e85f3806483ea420559d449e0e4d81bec996740f)
used in single public ip address mode.
when using this argument, --public-interface must also be used.
add a vnn structure to the ctdb context to describe the single public ip
address
update the killtcp control in the daemon so that if a socketpair that is
to be killed does not match a normal public address, it checks whether the
destination address matches the single public ip address and if so uses
that vnn structure from the ctdb context
this allows killtcp to also kill connections to the single public ip
instead of only to normal public addresses
(This used to be ctdb commit 5661ba17b91f62821dec1c76056c78b99752a90b)
to have one single public ip address for the entire cluster.
this ip address is attached to lo on all nodes but only the recmaster
will respond to arp requests for this address.
the recmaster then runs an ipmux process that will pass any incoming
packets to this ip address on to the other nodes in the cluster based on
the ip address of the client host
to use this feature one must
1, have one fixed ip address in the customer's network permanently
attached to an interface
2, set CTDB_PUBLIC_INTERFACE=
to specify on which interface the clients attach to the node
3, set CTDB_SINGLE_PUBLIC_IP=ip-address
to specify which ip address should be the "single public ip address"
to test with only a single client, attach several ip addresses to
the client and ping the public address from the client with different -I
options. look at a network trace to see to which node the packet is
passed on.
(This used to be ctdb commit 50d648c95e4e6d7c2867a034c2b550086d853320)
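A configuration sketch of the above, using the corrected variable names
and illustrative addresses/interfaces:

    # /etc/sysconfig/ctdb
    CTDB_PUBLIC_INTERFACE=eth0          # interface the clients attach to
    CTDB_SINGLE_PUBLIC_IP=10.1.1.100    # the "single public ip address"

    # from a test client that has several addresses, vary the source
    # address and watch a network trace to see which node receives it:
    ping -I 10.1.1.21 10.1.1.100
    ping -I 10.1.1.22 10.1.1.100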
the recmaster or not.
return 0 if the node is the recmaster and 1 (true) if it is not or if
we could not communicate with the ctdb daemon.
call it 'isnotrecmaster' to cope with the fact that if the tool could not
bind to the socket to talk to the daemon, the tool will automatically
return an error and exit code 1
thus the tool will only return 0 if it could talk successfully to the
local daemon and if the local daemon confirms this node is the recmaster
(This used to be ctdb commit ae5fcb790b6c3985f514fa8a96bc00c2619f2a28)
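Based on the exit-code semantics described above, a script could use the
command like this (a sketch, not taken from the ctdb sources):

    # exit status 0 means: the local ctdb daemon was reachable and
    # confirmed that this node is the recmaster
    if ctdb isnotrecmaster ; then
        echo "this node is the recmaster"
    else
        echo "not the recmaster, or could not talk to the local daemon"
    fi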
shouldn't or we are not holding addresses we should)
we must first freeze the local node before we set the recovery mode
(This used to be ctdb commit a77a77e8b5180f6a4a1f3d7d4ff03811f3b71b56)
set the node initially unhealthy and let the status monitoring bring the node online.
This fixes a problem with winbindd, where it refused to start because secrets.tdb was not populated
but we could not populate ctdbd, because the net command would not run while ctdbd was still doing startup
and thus frozen
(This used to be ctdb commit 3a001b793dd76fb96addf1e2ccb74da326fbcfbc)
c to prevent it from being immediately freed (and our persistent store
state with it) if we need to wait asynchronously for other nodes before
we can reply back to the client
(This used to be ctdb commit fa5915280933e4d2e7d4d07199829c9c2b87a335)
nodes so that the db is created on them as well
when we send this broadcast we must use the correct control and not
assume all databases created are of the temporary kind
(This used to be ctdb commit 106f816d4a0814ca4418de051289d9fc62df7dd2)
it is --event-script-dir not --event-script
add explanation of the public_addresses file
(This used to be ctdb commit 21325b23e786ac1c2abc07ea75b0814e9c725a9e)
addresses (i.e. they hold those they should hold and they don't hold
any of those they shouldn't hold)
if an inconsistency is found, mark the local node as recovery mode
active
and wait for the recovery master to trigger a full blown recovery
(This used to be ctdb commit 55a5bfc8244c5b9cdda3f11992f384f00566b5dc)
- add a flag to check that recovery completed correctly. If not, re-trigger it in monitoring
(This used to be ctdb commit d5ed941d9bab4af30d8b5f9b77bdf43d9218d69b)
need_takeover_run is set to true or else we might forget to rerun it
again during the next recovery
otherwise, need_takeover_run is only set to true if the node flags for
a remote node and the local node differ.
It is possible that a takeover run fails, and thus the reassignment of
ip addresses is incomplete, but that before we get back to the test in
monitor_cluster() the node flags of all nodes have converged
and match each other again, causing
monitor_cluster() to fail to realize that a takeover run is needed.
(This used to be ctdb commit ae7e866787cebd14394983ce1834387c959d1022)
- put removed IPs on loopback with scope host
- check for nul strings in ethtool call
(This used to be ctdb commit e2df1d6d08e67a36ff05a590a34c56e900741287)
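The 'scope host' placement mentioned above corresponds to an iproute2
command along these lines (the address is illustrative):

    # keep the released address configured on lo, but with host scope only,
    # so it is not announced on the network or used for routing
    ip addr add 10.1.1.101/32 dev lo scope host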
a bool that specifies whether the ip was held by a loopback adaptor or
not
the name of the interface where the ip was held
when we release an ip address from an interface, move the ip address
over to the loopback interface
when we release an ip address after we have moved it onto loopback,
use 60.nfs to kill off the server side (the local part) of the tcp
connection so that the tcp connections dont survive a
failover/failback
61.nfstickle, since we kill the tcp connections when we release an ip
address, we no longer need to restart the nfs service in 61.nfstickle
update ctdb_takeover to use the new signature for ctdb_sys_have_ip
when we add a tcp connection to kill in ctdb_killtcp_add_connection()
check if either the source or destination address matches a known public
address
(This used to be ctdb commit f9fd2a4719c50f6b8e01d0a1b3a74b76b52ecaf3)
10.interfaces starts up
this setting makes the system only respond to ARP requests on the NIC
that the ip address is tied to, and adds to the
"principle of least surprise" when using multihomed servers
(This used to be ctdb commit 39ddf347dc45f599964a4c17e67e71faed00e544)
and merge into a big list since, with the deassociation between a node
and a public ip address, the /etc/ctdb/public_addresses files can
differ between nodes and no single node knows about all public addresses
that a cluster can use
(This used to be ctdb commit e208294fed183977cacc44b2cd1195c11d967c18)
this command no longer makes sense when there is no one-to-one mapping
between a node and its default public ip
(This used to be ctdb commit 91280db7f6dd3d659edd86fae21ba347d6f9da9e)
we must always restart the lockmanager when the cluster has been
reconfigured and ip addresses have changed. This is to make sure we get a
clusterwide grace period for nfs locking.
if we don't do this and only restart locking on the nodes that were
directly affected, a different client can take out a conflicting lock
from a different node before the affected clients have had a chance to
reclaim all the locks lost during the reconfigure.
the grace period on the rhel5 kernel has been increased to 90 seconds!
statd-callout:
we must restart lockmanager to ensure a clusterwide grace period for
nfs. this makes locking "more correct" for nfs clients and prevents
other clients/nodes from taking out a conflicting lock while a different
client/node tries to reclaim lost locks.
This makes it "almost consistent" for NFS clients but there is still
the possibility that a cifs client can take out a conflicting lock
before an nfs client has had a chance to reclaim an existing lock.
This can not be solved with anything less than making the kernel nfs
lock manager "samba aware" and making samba aware of the internal state
of the kernel lock manager so that they can cooperate.
we can not just stop/start the lockmanager back to back in rhel5 since,
if it is stopped/started too close together, then when the new
lockmanager sends out statd notifications upon starting up, two things
can happen:
1, new lockmanager sends out notification BEFORE it has registered with
portmapper leading to
lockmanager starts
lockmanager sends notification to the client
client tries to recover the lock and tries to portmap the lockmanager
port on the server.
server is not (yet) registered with portmapper and responds
"no such program" to the client's request to discover where lockmanager
is.
client then just completely gives up reclaiming the lock and doesn't
even reattempt the portmapper call after some timeout.
==> lock reclaim failed.
2, if they are started back to back and a client tries to reclaim the
lock, the lockmanager sometimes sends two responses back to back
to the client: one with status NLM_GRANTED (== you got the lock
reclaimed) and one with status NLM_DENIED (== you could not get the lock
reclaimed)
This confuses the client and leads to the server thinking that the
client does have the lock and the client thinking it has not got the
lock and orphaned locks result.
We also send out additional notification messages of different formats
to allow more legacy clients to interoperate with locking.
(This used to be ctdb commit 13208c1aab2942e28dff87e38e6794bf0c026033)
files
so that we can partition the cluster into different subsets of nodes
which each serve a different subset of the public addresses
(This used to be ctdb commit 889e0fe69e4c88c6166282b12843b8d9727552d6)
every time we release an ip.
this context is used to hold all resources needed when sending out
gratuitous arps and tcp tickles during ip takeover.
we hang it off the vnn structure that manages that particular ip address
instead, so that we can have multiple ones going in parallel
this bug (or the same bug in a different shape) has probably been in ctdb
for a very long time but is likely to be hard to trigger
(This used to be ctdb commit c58db1cadaba253b2659573673b28c235ef7db76)
multiple public addresses spread across multiple interfaces on each
node.
this is a massive patch since we have previously made the assumption that
we only have one public address per node.
get rid of the public_interface argument. the public addresses file
now explicitly lists which interface the address belongs to
(This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8)
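With this change a /etc/ctdb/public_addresses file names the interface
for each address explicitly, for example (addresses and interface names
are illustrative):

    192.168.1.101/24 eth1
    192.168.1.102/24 eth1
    192.168.2.101/24 eth2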