This directory is where you should put any local or
application-specific event scripts for ctdb to call.

All event scripts start with the prefix 'NN.' where N is a digit.
The event scripts are run in sequence based on NN.
Thus 10.interfaces will be run before 60.nfs.

Each NN must be unique and duplicates will cause undefined behaviour.
I.e. having both 10.interfaces and 10.otherstuff is not allowed.

As a special case, any event script that ends with a '~' character
will be ignored, since this is a common suffix that some editors
append to older versions of a file.

The event scripts are called with a varying number of arguments.
The first argument is the "event", and the rest of the arguments
depend on which event was triggered.

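For illustration, the overall shape of an event script is a dispatch
on the event name, something like the sketch below (a hypothetical
40.myservice, not a script shipped with ctdb):

    #!/bin/sh
    # 40.myservice - hypothetical example event script.
    # The first argument names the event; the remaining arguments
    # depend on which event was triggered.
    event="$1"
    shift

    case "$event" in
        startup)
            # start the service and wait until it is available
            ;;
        shutdown)
            # perform a controlled shutdown of the service
            ;;
        monitor)
            # exit 0 if the service is healthy, nonzero if not
            ;;
        takeip|releaseip)
            # $1=interface $2=ipaddress $3=netmask
            ;;
        startrecovery|recovered)
            # bracket the recovery / ip reallocation process
            ;;
    esac

    exit 0
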
The events currently implemented are:

startup

    This event does not take any additional arguments.
    This event is only invoked once, when ctdb is starting up.
    This event is used to wait for the service to start and for all
    of the service's resources to become available.

    This is used to prevent ctdb from starting up and advertising its
    services until all dependent services have become available.

    All services that are managed by ctdb should implement this
    event and use it to start the service.

    Example: 50.samba uses this event to start the samba daemon and
    then waits until samba and all its associated services have
    become available. It then also waits until all shares have
    become available.

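    A minimal sketch of such a startup handler, with a hypothetical
    init script and daemon name:

        startup)
            # start the daemon, then wait until it is actually
            # running before returning, so ctdb does not advertise
            # the service too early
            /etc/init.d/mydaemon start || exit 1
            while ! pidof mydaemon >/dev/null ; do
                sleep 1
            done
            ;;
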
shutdown

    This event is called when the ctdb service is shutting down.

    All services that are managed by ctdb should implement this event
    and use it to perform a controlled shutdown of the service.

    Example: 60.nfs uses this event to shut down nfs and all
    associated services and to stop exporting any shares when this
    event is invoked.

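    The corresponding shutdown handler can be very small (same
    hypothetical init script as above):

        shutdown)
            # controlled shutdown of the managed service
            /etc/init.d/mydaemon stop
            ;;
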
monitor

    This event is invoked at a regular interval. The interval can be
    configured using the MonitorInterval tunable but defaults to 15
    seconds.

    This event is triggered by ctdb to continuously monitor that all
    managed services are healthy. When invoked, the event script
    should check that the service is healthy and return 0 if so. If
    the service is not healthy, the event script should return
    nonzero.

    If a service returns nonzero from this script, ctdb will consider
    the node UNHEALTHY and will fail the public address and all
    associated services over to a different node in the cluster.

    All managed services should implement this event.

    Example: 10.interfaces checks that the public interface (if used)
    is healthy, i.e. that it has a physical link established.

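    A monitor handler reduces to a health check whose exit status
    tells ctdb whether the service is healthy. A sketch, assuming a
    hypothetical daemon listening on tcp port 12345:

        monitor)
            # healthy if something is listening on our hypothetical
            # tcp port; a nonzero exit marks the node UNHEALTHY
            if netstat -lnt | grep -q ':12345 ' ; then
                exit 0
            fi
            exit 1
            ;;
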
takeip

    This event is triggered every time the node takes over a public
    ip address during recovery.
    This event takes three additional arguments:
    'interface', 'ipaddress' and 'netmask'.

    Before this event there will always be a 'startrecovery' event.

    This event will always be followed by a 'recovered' event once
    all ip addresses have been reassigned to new nodes and the ctdb
    database has been recovered. If multiple ip addresses are
    reassigned during recovery, it is possible to get several
    'takeip' events followed by a single 'recovered' event.

    Since taking over an ip address might involve substantial work
    for the service, and since multiple ip addresses might be taken
    over in a single recovery, it is often best to only mark which
    addresses are being taken over in this event and to defer the
    actual work of reconfiguring or restarting the services until the
    'recovered' event.

    Example: 60.nfs just records which ip addresses are being taken
    over into a local state directory and defers the actual restart
    of the services until the 'recovered' event.

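    In that spirit, a takeip handler can be as small as recording the
    address for later use (the state directory here is a hypothetical
    stand-in for whatever 60.nfs actually uses):

        takeip)
            # $1=interface $2=ipaddress $3=netmask
            # just record the address; the real work is deferred
            # to the 'recovered' event
            mkdir -p /var/run/myservice/state
            touch /var/run/myservice/state/"$2"
            ;;
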
releaseip

    This event is triggered every time the node releases a public ip
    address during recovery.
    This event takes three additional arguments:
    'interface', 'ipaddress' and 'netmask'.

    In all other regards this event is analogous to the 'takeip'
    event above.

    Example: 60.nfs

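    The matching releaseip handler for the sketch above simply drops
    the recorded address (same hypothetical state directory):

        releaseip)
            # $1=interface $2=ipaddress $3=netmask
            rm -f /var/run/myservice/state/"$2"
            ;;
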
startrecovery

    This event is triggered every time we start a recovery process or
    before we start changing ip address allocations.

recovered

    This event is triggered every time we have finished a full
    recovery and also after we have finished reallocating the public
    ip addresses across the cluster.

    Example: 60.nfs, which, if the ip address configuration has
    changed during the recovery (i.e. if addresses have been taken
    over or released), will kill off any tcp connections that exist
    for that service and also send out statd notifications to all
    registered clients.

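    A recovered handler might then act once on everything recorded by
    the takeip sketch above and clear the state (hypothetical names
    again; the real 60.nfs kills stale tcp connections and sends
    statd notifications here):

        recovered)
            # if any addresses were recorded by 'takeip', do the
            # deferred reconfiguration exactly once, then clear the
            # recorded state
            if ls /var/run/myservice/state/* >/dev/null 2>&1 ; then
                /etc/init.d/mydaemon restart
                rm -f /var/run/myservice/state/*
            fi
            ;;
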
Additional note for takeip, releaseip, recovered:

ALL services that depend on the ip address configuration of the node
must implement all three of these events.

ALL services that use TCP should also implement these events and at
least kill off any tcp connections to the service if the ip address
configuration has changed, in a similar fashion to how 60.nfs does it.

The reason one must do this is that ESTABLISHED tcp connections may
survive when an ip address is released and removed from the host,
until the ip address is taken over again. Any tcp connections that
survive a release/takeip sequence can potentially cause the
client/server tcp connection to get out of sync with sequence and ack
numbers and cause a disruptive ack storm.