ctdb_attach() so that we can pass TDB_NOSYNC when we attach to
a persistent database and want fast unsafe writes instead of
slow but safe tdb_transaction writes.
enhance the ctdb_persistent test suite to test both safe and unsafe writes
(This used to be ctdb commit 4948574f5a290434f3edd0c052cf13f3645deec4)
Since these two controls (UPDATE_RECORD and PERSISTENT_STORE) can respond
asynchronously to the control, we can not allocate the state variable as a child of ctdb_req_control. Instead we must allocate state as a child of ctdb itself
and steal ctdb_req_control so that it becomes a child of state.
Otherwise both ctdb_req_control and state would be released immediately after we have finished setting up the async reply and returned.
(This used to be ctdb commit 6f6de0becd179be9eb9a6bf70562b090205ce196)
With these patches, ctdbd will enforce and (by default) always use
tdb_transactions when updating/writing records to a persistent database.
This might come with a small performance degradation since transactions
are slower than no transactions at all.
If a client, such as samba, wants to use a persistent database but does NOT
want to pay the performance penalty, it can specify TDB_NOSYNC as the
srvid parameter in the ctdb_control() for CTDB_CONTROL_DB_ATTACH_PERSISTENT.
In this case CTDBD will remember that "this database is not that important",
so it can use unsafe (no transaction) tdb_stores to write the updates (sketched below).
It will be faster than the default (always use a transaction) but less crash safe.
(This used to be ctdb commit 3d85d2cf669686f89cacdc481eaa97aef1ba62c0)
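A minimal sketch of this choice, using only the plain tdb API (an illustration of the idea, not the actual ctdbd code path):

    /* Illustration only: if the database was attached with TDB_NOSYNC,
     * do a plain tdb_store(); otherwise wrap the store in a tdb
     * transaction so a crash cannot leave a torn record behind. */
    #include <tdb.h>

    static int store_persistent_record(struct tdb_context *tdb, int tdb_flags,
                                       TDB_DATA key, TDB_DATA data)
    {
            if (tdb_flags & TDB_NOSYNC) {
                    /* fast but not crash safe */
                    return tdb_store(tdb, key, data, TDB_REPLACE);
            }
            /* slow but safe: transaction-wrapped store */
            if (tdb_transaction_start(tdb) != 0) {
                    return -1;
            }
            if (tdb_store(tdb, key, data, TDB_REPLACE) != 0) {
                    tdb_transaction_cancel(tdb);
                    return -1;
            }
            return tdb_transaction_commit(tdb);
    }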
for stores into persistent databases, ALWAYS use a lockwait child to take out the lock for the record, and never the daemon itself.
(This used to be ctdb commit 7fb6cf549de1b5e9ac5a3e4483c7591850ea2464)
the address might be a public address on a different node, so there is no need to fill up the logs with those messages
(This used to be ctdb commit c8181476748395fe6ec5284c49e9d37b882d15ea)
just disable the monitoring during the "startrecovery" event and enable it again once recovery has completed
(This used to be ctdb commit 68029894f80804c9f31fc90ed0c1b58f75812c3d)
Since this event won't run unless the recovery mode is normal, and we
can not know what the recovery mode will be in the future on a remote node,
it is pointless to try to check whether these commands (which will execute at some later time on another node) worked or not,
in particular since a "failure to successfully run the eventscript" would then trigger a full new recovery, which is disruptive and expensive.
(This used to be ctdb commit 2c292039a0139dcf5bb2bd964eb6f8902d094c50)
This attempts to fix the problem of ctdb event scripts blocking due to
attempted access to the ctdb databases during recovery. The changes are:
- now only the 'shutdown' and 'startrecovery' events can be called
with the databases locked in recovery. The event scripts must ensure
that for these two events no database access is attempted
- the recovered, takeip and releaseip events could previously be called
inside a recovery. The code now ensures that this doesn't happen, delaying
the events till after recovery has finished
- the 50.samba event script now avoids using testparm unless it is really
needed
This needs extensive testing.
(This used to be ctdb commit e3cdb8f2be6a44ec877efcd75c7297edb008a80b)
This enhances the framework for sending tcp tickles to be able to send ipv6 tickles as well.
Since we can not use one single RAW socket to send both handcrafted ipv4 and ipv6 packets, instead of always opening TWO sockets (one ipv4 and one ipv6) we get rid of the helper ctdb_sys_open_sending_socket() and just open (and close) a raw socket of the appropriate type inside ctdb_sys_send_tcp(), as sketched below.
We know which type of socket (v4/v6) to use based on the sin_family of the destination address.
Since ctdb_sys_send_tcp() opens its own socket we no longer need to pass a socket
descriptor as a parameter. Get rid of this redundant parameter and fix up all callers.
(This used to be ctdb commit 406a2a1e364cf71eb15e5aeec3b87c62f825da92)
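A minimal sketch of picking the raw-socket family from the destination address (plain POSIX sockets, not the ctdb implementation):

    /* Open a raw socket whose family matches the destination address,
     * use it for a single handcrafted packet and then close it again. */
    #include <sys/socket.h>
    #include <netinet/in.h>

    static int open_raw_socket_for(const struct sockaddr *dst)
    {
            if (dst->sa_family == AF_INET) {
                    return socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
            }
            if (dst->sa_family == AF_INET6) {
                    return socket(AF_INET6, SOCK_RAW, IPPROTO_RAW);
            }
            return -1;
    }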
If a transaction could be started, do a safe transaction store when updating the record inside the daemon.
If the transaction could not be started (maybe another samba process has a lock on the database?) then just do a normal store instead (rather than blocking the ctdb daemon).
The client can "signal" ctdb that updates to this database should, if possible, be done using safe transactions by specifying the TDB_NOSYNC flag when attaching to the database.
The TDB flags are passed to ctdb in the "srvid" field of the control header when attaching using the CTDB_CONTROL_DB_ATTACH_PERSISTENT control (see the sketch below).
Currently, samba3.2 does not yet tell ctdbd to handle any persistent databases using safe transactions.
If samba3.2 wants a particular persistent database to be handled using
safe transactions inside the ctdbd daemon, it should pass
TDB_NOSYNC as the flags to the call to attach to a persistent database;
in ctdbd_db_attach() it currently specifies 0 as the srvid.
(This used to be ctdb commit 8d6ecf47318188448d934ab76e40da7e4cece67d)
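A hedged sketch of the attach convention described above; send_attach_persistent_control() is a hypothetical stand-in for the real client control call, only the idea of carrying the tdb flags in the 64-bit srvid field follows the commit message:

    #include <stdbool.h>
    #include <stdint.h>
    #include <tdb.h>        /* for TDB_NOSYNC */

    /* hypothetical helper standing in for sending the
     * CTDB_CONTROL_DB_ATTACH_PERSISTENT control */
    extern int send_attach_persistent_control(const char *db_name, uint64_t srvid);

    static int attach_persistent(const char *db_name, bool want_unsafe_writes)
    {
            /* the tdb flags ride in the srvid field of the control header */
            uint64_t srvid = want_unsafe_writes ? TDB_NOSYNC : 0;

            return send_attach_persistent_control(db_name, srvid);
    }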
If we shut down the transport and CTDB later decides to send a command out
for queueing, the call to ctdb->methods->allocate_pkt() will SEGV.
This could trigger, for example, when we are in the process of shutting down CTDBD and have already shut down the transport but are still waiting for the
"shutdown" eventscripts to finish.
If the event scripts then take much longer to execute for some reason, this
race condition becomes much more probable.
Decorate all dereferencing of ctdb->methods-> with a check that ctdb->methods is non-NULL (as in the sketch below).
(This used to be ctdb commit c4c2c53918da6fb566d6e9cbd6b02e61ae2921e7)
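An illustrative guard for the pattern above; the struct layout here is simplified, only the field names follow the commit message:

    #include <stddef.h>

    struct example_ctdb_methods {
            void *(*allocate_pkt)(void *mem_ctx, size_t len);
    };

    struct example_ctdb_context {
            struct example_ctdb_methods *methods;
    };

    static void *allocate_pkt_safe(struct example_ctdb_context *ctdb,
                                   void *mem_ctx, size_t len)
    {
            /* transport already shut down: refuse instead of crashing */
            if (ctdb->methods == NULL) {
                    return NULL;
            }
            return ctdb->methods->allocate_pkt(mem_ctx, len);
    }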
Remove a bogus check inside the recovery daemon that only redistributed public addresses if the local node had/served public addresses.
This was a valid optimization long ago, when we enforced that all nodes must use the same public addresses file, but it is invalid today where we can have different public addresses configs on all nodes and even have some nodes that do NOT use public addresses at all.
(This used to be ctdb commit 5833e6b99d9afaf35dc8354df8676b9115418b23)
We should always use type-safe talloc functions when possible. In this case we were allocating bytes instead of uint32_t.
(This used to be ctdb commit cb14ee57dd0a589242da1ac2830bb7939df460a5)
This allows us to use the async framework also for controls that return
outdata.
Add a "capabilities" field to the ctdb_node structure. This field is
only initialized and kept valid inside the recovery daemon context and not
inside the main ctdb daemon.
Change the GET_CAPABILITIES control to return the capabilities in outdata instead of in the res return variable.
When performing a recovery inside the recovery daemon, read the capabilities from all connected nodes and update the ctdb->nodes list of nodes.
When building the new vnnmap after the database rebuild in recovery, do not include any nodes which lack the LMASTER capability in the new vnnmap,
unless there is no connected node available with the LMASTER capability, in which case we let the local node (recmaster) take on the lmaster role temporarily (i.e. become a member of the vnnmap list).
(This used to be ctdb commit 0f1883c69c689b28b0c04148774840b2c4081df6)
Handle failure to get/hold the reclock pnn file better: just
treat it as a transient backend filesystem error and try again later
instead of shutting down the recovery daemon.
When we have lost the pnn file and we are recmaster,
release the recmaster role so that someone else can become recmaster instead.
(This used to be ctdb commit e513277fb09b951427be8351d04c877e0a15359d)
Define two capabilities:
- can be recmaster
- can be lmaster
Default both capabilities to YES (see the sketch below).
Update the ctdb tool to read capabilities off a node.
(This used to be ctdb commit 50f1255ea9ed15bb8fa11cf838b29afa77e857fd)
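A small sketch of capabilities as bit flags; the macro names mirror the capabilities described above, but the values and the helper are illustrative:

    #include <stdbool.h>
    #include <stdint.h>

    #define CTDB_CAP_RECMASTER 0x00000001   /* node may become recmaster */
    #define CTDB_CAP_LMASTER   0x00000002   /* node may serve as lmaster */

    /* both capabilities default to YES */
    static const uint32_t default_capabilities =
            CTDB_CAP_RECMASTER | CTDB_CAP_LMASTER;

    static bool can_be_lmaster(uint32_t caps)
    {
            return (caps & CTDB_CAP_LMASTER) != 0;
    }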
remove the transaction stuff and push so that the git tree will work
This reverts commit 539bbdd9b0d0346b42e66ef2fcfb16f39bbe098b.
(This used to be ctdb commit 876d3aca18c27c2239116c8feb6582b3a68c6571)
thus allowing the client to pass through the TDB_NOSYNC flag
- ensure that tdb_store() operations on persistent databases that don't
have TDB_NOSYNC set happen inside a transaction wrapper, thus making
them crash safe
(This used to be ctdb commit 49330f97c78ca0669615297ac3d8498651831214)
Add support in AIX to track the PID of a client that connects to the unix domain socket
(This used to be ctdb commit 4c006c675d577d4a45f4db2929af6d50bc28dd9e)
and a ctdb command to pull the talloc memory map from a recovery daemon
ctdb rddumpmemory
(This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05)
The controls only modify the runtime setting of which public addresses a node
can serve and do not modify /etc/ctdb/public_addresses.
To make the change permanent you also need to edit /etc/ctdb/public_addresses
manually.
After ip addresses have been added/deleted you need to invoke a recovery
for the ip addresses to be redistributed.
(This used to be ctdb commit f8294d103fdd8a720d0b0c337d3973c7fdf76b5c)
Add back the controls to enable/disable monitoring we used to have for debugging but removed a while ago
(This used to be ctdb commit 8477f6a079e2beb8c09c19702733c4e17f5032fe)
This can cause a memory leak if the call is terminated before we have managed to respond to the client
(the call is talloc_free()d but the data is still hanging off ctdb).
Instead we must talloc_steal() the data and hang it off the call structure to avoid the memory leak.
In order to do this we must also change the call structure that is passed into ctdb_call_local() to be allocated through talloc().
This structure was previously either a static variable, or an element of a larger talloc()ed structure (ctdb_call_state or ctdb_client_call_state), so
we must change all creations of a ctdb_call to explicitly create it through talloc().
(This used to be ctdb commit 4becf32aea088a25686e8bc330eb47d85ae0ef8f)
Vacuuming used to delete one record at a time on all nodes, which was
m*n behaviour, required a huge storm of ctdb-to-ctdb controls, and just wouldn't scale at all.
The new vacuuming process collects all records to be deleted locally and then sends only one control to the other nodes. This control contains a list of all records to be deleted (an illustrative layout is sketched below).
(This used to be ctdb commit 9e625ece19a91f362c9539fa73b6b2108f0d9c53)
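An illustrative wire layout for such a batched delete; the struct and field names here are hypothetical, not the actual ctdb marshalling code:

    #include <stdint.h>

    /* one control carries every record key to delete for one database,
     * instead of one control per record per node (the old m*n behaviour) */
    struct vacuum_delete_list {
            uint32_t db_id;           /* which database the keys belong to */
            uint32_t count;           /* number of record keys that follow */
            uint8_t  packed_keys[];   /* 'count' length-prefixed keys */
    };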
when this tunable is set, ip addresses will only be failed over when a node
fails, and only those ip addresses held by the failed node will be reallocated
in the cluster.
When a node becomes active again, this will not lead to any failback of ip addresses.
This can reduce the number of "ip address movements" in the cluster, since we don't automatically fail an ip address back, but it can also lead to an unbalanced cluster since we no longer attempt to spread the ip addresses out evenly across the active nodes.
This tunable can NOT be active at the same time as DeterministicIPs is used.
(This used to be ctdb commit d3b8a461b15bc584fa1785eb5922de6d49d8f6c4)
a node that has been allocated to serve an ip actually CAN serve that ip
(if we use differing public_addresses files on each node)
(This used to be ctdb commit fdaf7cb2d7682507fbf4c6c2b833b327c93fac08)
of connected nodes
num_active only contains the number of active nodes and would thus not count
banned nodes
(This used to be ctdb commit 06d3ce470766ef0b60d68ccd84de5437146cc147)
Once every such interval:
* the recovery master on each node will update the "connected" count in the
reclock count file (ctdb getreclock)
* if the node thinks it is a recovery master but it detects another node
that is DISCONNECTED but which still holds a lock on the reclock count file,
this may mean that we have a split cluster.
If that other node, which is DISCONNECTED but still holds the lock on the reclock
pnn count file, is MORE connected than the local node,
yield the recmaster role and let the other half of the cluster take over (sketched below).
This adds a second, last-chance mechanism to detect split clusters.
If the cluster is split but GPFS is not yet split, this mechanism makes
the largest half of the cluster become the active half.
(This used to be ctdb commit 07af425f444531942cce8abff112c1524228d287)
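A hedged sketch of the tie-break; the helper functions are hypothetical, only the decision logic follows the description above:

    #include <stdbool.h>
    #include <stdint.h>

    /* hypothetical helpers standing in for the real recovery daemon state */
    extern bool reclock_held_by_disconnected_node(void);
    extern uint32_t reclock_holder_connected_count(void);
    extern uint32_t local_connected_count(void);
    extern void yield_recmaster_role(void);

    static void check_for_split_cluster(void)
    {
            if (!reclock_held_by_disconnected_node()) {
                    return;
            }
            /* a DISCONNECTED node still holds the reclock: possible split */
            if (reclock_holder_connected_count() > local_connected_count()) {
                    /* the other half sees more nodes, let it take over */
                    yield_recmaster_role();
            }
    }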
CTDB_START_AS_DISABLED="yes"
and command line argument
--start-as-disabled
When set, this makes the ctdb node always start in DISABLED mode, and it will thus not host any public ip addresses.
The administrator must manually "ctdb enable" the node after it has started, once the administrator wants the node to start hosting public ip addresses.
Using this option it is possible to start ctdb on a node without causing any reallocation of ip addresses when it is starting. The node will still merge with the cluster and there will still be a recovery phase but the ip address allocations will not change in the cluster.
(This used to be ctdb commit b93d29f43f5306c244c887b54a77bca8a061daf2)
add a new control that causes the node to drop the current nodes list
and reread it from the nodes file.
During this operation, the node will also drop the tcp layer and restart it.
When we drop the tcp layer by talloc_free()ing the ctcp structure,
add a destructor to ctcp so that we can also clean up and remove the references in the ctdb structure to the transport layer.
Add two new commands for the ctdb tool:
one to list all nodes in the nodesfile, and a second to trigger a node to drop the transport and reinitialize it with the new nodes file.
(This used to be ctdb commit 4bc20ac73e9fa94ffd43cccb6eeb438eeff9963c)
memory tree to stdout. This is much more useful than putting it in the log, and it also fixes
a bug where the pipe would overflow internally and cause ctdbd to lock up.
(This used to be ctdb commit e236979e2162d9bd7a495086342168a696cf76c5)
nodes into two separate files.
move the monitoring of keepalives for detecting connected/disconnected
remote nodes into ctdb_keepalive.c
(This used to be ctdb commit 23a57b20c314d5f11a433cf251eb9d9de743849a)
ctdb vacuum : vacuums all the databases, deleting any zero length
ctdb records
ctdb repack : repacks all the databases, resulting in a perfectly
packed database with no freelist entries
(This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4)
ctdb_recoverd.c
Always handle banning/unbanning locally on the node that is being
banned/unbanned instead of on the recovery master.
This means that if a ban request comes in to the recovery master for a
remote node, we pass the request on to the remote node instead of
setting up the ban and ban timeouts locally.
ctdb.c
send ban/unban requests to the node being banned/unbanned instead of to
the recmaster
(This used to be ctdb commit 880dd9f5fd0b91e450da93e195cc5c62cb1dcd6e)
the banned_nodes array and not the rec structure so that ban_state is
destroyed when the banned_nodes array gets destroyed
(and so that when this struct is destroyed, any pending
ctdb_ban_timeout events are also destroyed).
Otherwise we may end up with multiple ban_timeout timed events going in
parallel, since we destroy/recreate the banned_nodes structure during
election but we never destroy/recreate the rec structure.
(This used to be ctdb commit fbd663d56a2a4421a5c0e541962c87e2e9c7cd82)
flag.
change calling of the recovered/takeip/releaseip event scripts to use
these enable/disable functions instead of stopping/starting monitoring.
when we disable monitoring we want all events to still be running,
in particular the events to monitor for dead nodes; we only want to
suppress running the "monitor" event scripts
(This used to be ctdb commit a006dcc4f75aba950dd701ad7d1a84e89df285e8)
monitoring should always be enabled
(though a node may want to temporarily disable running the "monitor"
event scripts but can do so internally without the need for this
control)
(This used to be ctdb commit e3a33618026823e6af845fd8513cddb08e6b5584)
Check whether monitoring is enabled or not before creating new events,
and otherwise log why the event is not set up.
(This used to be ctdb commit 2f352b2606c04a65ce461fc2e99e6d6251ac4f20)
control, instead call ctdb_start/stop_monitoring()
In ctdb_stop_monitoring(), don't allocate a new monitoring context; leave it
NULL. Also set the monitoring_mode in this function so that
ctdb_stop/start_monitoring() and ->monitoring_mode are kept in sync.
Add a debug message to log that we have stopped monitoring.
In ctdb_start_monitoring(), check whether monitoring is already active and
make the function idempotent.
Create the monitoring context when monitoring is started.
Update ->monitoring_mode once the monitoring has been started.
Add a debug message to log that we have started monitoring.
When we temporarily stop monitoring while running an event script,
restart monitoring after the event script wrapper returns instead of in
the event script callback.
Let monitoring_mode start out as DISABLED and let it be enabled once we call ctdb_start_monitoring.
Don't check for MONITORING_DISABLED in check_fore_dead_nodes(). If
monitoring is disabled, this event handler will not be called.
(This used to be ctdb commit 3a93ae8bdcffb1adbd6243844f3058fc742f76aa)
When we are the recmaster and we update the local flags for all the
nodes, if one of the nodes fails to respond and give us its flags,
mark that node as a "culprit".
As one of the first things to do in the monitor_cluster loop, check if
the current culprit has caused too many (20) failures and if so ban that
node.
This is for the situation where a remote node may still be CONNECTED but
fails to respond to the getnodemap control, causing the recovery
master to loop in monitor_cluster, aborting the monitoring when the
node fails to respond but before anything triggers a call to
do_recovery().
If one or more of the databases or nodes are frozen at this stage, this
would lead to smbd being blocked for potentially a longish time.
(This used to be ctdb commit 83b0261f2cb453195b86f547d360400103a8b795)
specific instance of ctdbd should bind to. This helps when running a
"virtual" cluster on a single machine where all instances bind to
different alias interfaces.
If --node-ip is specified, then we will only try to bind to this ip
address. Otherwise we fall back to the original method of trying the
ip addresses in /etc/ctdb/nodes one by one until we find one we can bind
to.
No variable in /etc/sysconfig/ctdb added since this parameter only makes
sense in a virtual test/debug cluster.
(This used to be ctdb commit d96cb02c2c24f9eabbc53d3d38e90dea49cff3e0)
recovery daemon and the ctdb daemon both agree on whether the node is
banned or not and if they disagree then reban the node again after
logging an error to the debug log
(This used to be ctdb commit 6cd6e534493066edd4bb2c6ae5be0e9a9d495aa0)
when these functions are called to ban or unban a node make sure we
update the CTDB_NODE_BANNED flag in rec->node_flags since this field and
flag are checked during the election process
(This used to be ctdb commit 740c632ae96a2d34327d1b575780aaf079d93f4f)
so it differs from what the local ctdb daemon on the recovery master
thinks it should be, we should call for a re-election
(This used to be ctdb commit 21ad6039c31ef5cc0e40a35a41220f91943947cb)
flags differ between the local ctdb daemon and the remote node,
we can force a flags update on all nodes and not just on the local daemon
(This used to be ctdb commit a924eb89c966ecbae029ca137e06cffd40cc70fd)
flags
in update_local_flags()
(this is only called if we are, or believe we are, the recmaster)
when we detect that the flags of a remote node are different from what
our local node thinks the flags should be for that remote node,
we should send a node-flag-changed message to the local daemon so
that it updates the flags for that node.
(This used to be ctdb commit 36225e4e271f7a4065398253747fb20054f99a53)
of the startup event scripts after the point where recovery has
started and the node is in normal operation
This makes the 'startup' script just a special type of the 'monitor'
script which is called first
(This used to be ctdb commit 7424c30a5fd04aea0137c466b4318c3f185280d8)
shut down and restart the transport.
Otherwise, if we use the tcp transport, the tcp connection might try to
retransmit the queued data during the time the node is unavailable.
This, together with the exponential backoff for tcp, means that the tcp
connection quickly reaches the maximum backoff rto, which is often 60 or
120 seconds. That would mean it could take up to 60/120 seconds
before the tcp layer detects that the connection is dead and it has to
be reestablished.
(This used to be ctdb commit 0256db470879ce556b0f00070f7ebeaf37e529ab)
recovery mode back to NORMAL that we can not lock the reclock file
since at this stage it MUST be locked by the recovery daemon.
In order to prevent the non-blocking fcntl() lock from blocking and causing
"issues", we move the 'test that we can not lock the reclock file' into a
child process.
(This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e)
public addresses to nodes deterministic.
Activate it by adding CTDB_SET_DeterministicIPs=1 in /etc/sysconfig/ctdb.
When this is set, the first entry in /etc/ctdb/public_addresses will
always be hosted by node 0 when that node is available, the second
entry by node 1, and so on (sketched below).
This tunable allows the allocation of addresses to become very
unbalanced and is only for debugging/testing use.
Beware: this feature requires that /etc/ctdb/public_addresses is
identical on all the nodes in the cluster.
(This used to be ctdb commit f0ca221f235731542090d8a6c86f2b7cd2ce2f96)
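A minimal sketch of the deterministic preference (not the ctdb allocator itself; the modulo generalization and the helper are assumptions used for illustration):

    #include <stdbool.h>
    #include <stdint.h>

    /* hypothetical helper: is the node healthy and available? */
    extern bool node_is_available(uint32_t pnn);

    /* public address i prefers node i (modulo the number of nodes);
     * return -1 so the caller can fall back to another choice */
    static int preferred_node_for_address(uint32_t address_index,
                                          uint32_t num_nodes)
    {
            uint32_t pnn = address_index % num_nodes;

            return node_is_available(pnn) ? (int)pnn : -1;
    }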
Even though we don't want a blocking lock, it does appear that the fcntl()
call can block for a while if gpfs is in the process of rebuilding
itself after a node arrives in or leaves the cluster.
(This used to be ctdb commit 6c0d206dea7116db71bccb4802a93dd7283249f6)
make sure we read and update the flags from all remote nodes before we
reach the first codepath that can call do_recovery()
since during do_recovery() we need to know what the flags are.
(This used to be ctdb commit e85f3806483ea420559d449e0e4d81bec996740f)
used in single public ip address mode.
when using this argument, --public-interface must also be used.
add a vnn structure to the ctdb context to describe the single public ip
address
update the killtcp control in the daemon so that if a socketpair that is to
be killed does not match a normal public address, it checks whether the
destination address matches the single public ip address and, if so, uses
that vnn structure from the ctdb context
this allows killtcp to also kill connections to the single public ip
instead of only to normal public addresses
(This used to be ctdb commit 5661ba17b91f62821dec1c76056c78b99752a90b)
shouldn't or we are not holding addresses we should)
we must first freeze the local node before we set the recovery mode
(This used to be ctdb commit a77a77e8b5180f6a4a1f3d7d4ff03811f3b71b56)
set the node initially unhealthy and let the status monitoring bring the node online.
This fixes a problem with winbindd, where it refused to start because secrets.tdb was not populated,
but we could not populate ctdbd because the net command would not run while ctdbd was still starting up
and thus frozen.
(This used to be ctdb commit 3a001b793dd76fb96addf1e2ccb74da326fbcfbc)
c to prevent it from being immediately freed (and our persistent store
state with it) if we need to wait asynchronously for other nodes before
we can reply back to the client
(This used to be ctdb commit fa5915280933e4d2e7d4d07199829c9c2b87a335)
nodes so that the db is created on them as well.
When we send this broadcast we must use the correct control and not
assume all databases created are of the temporary kind.
(This used to be ctdb commit 106f816d4a0814ca4418de051289d9fc62df7dd2)
addresses (i.e. they hold those they should hold and they don't hold
any of those they shouldn't hold)
if an inconsistency is found, mark the local node as recovery mode
active
and wait for the recovery master to trigger a full blown recovery
(This used to be ctdb commit 55a5bfc8244c5b9cdda3f11992f384f00566b5dc)
- add a flag to check that recovery completed correctly. If not, re-trigger it in monitoring
(This used to be ctdb commit d5ed941d9bab4af30d8b5f9b77bdf43d9218d69b)
need_takeover_run is set to true, or else we might forget to rerun it
again during the next recovery.
Otherwise, need_takeover_run is only set to true if the node flags for
a remote node and the local node differ.
It is possible that a takeover run fails, leaving the reassignment of
ip addresses incomplete, but that before we get back to the test in
monitor_cluster() the node flags of all nodes have converged
and match each other again, thus causing
monitor_cluster() to fail to realize that a takeover run is needed.
(This used to be ctdb commit ae7e866787cebd14394983ce1834387c959d1022)
a bool that specifies whether the ip was held by a loopback adaptor or
not
the name of the interface where the ip was held
when we release an ip address from an interface, move the ip address
over to the loopback interface
when we release an ip address after we have moved it onto loopback,
use 60.nfs to kill off the server side (the local part) of the tcp
connection so that the tcp connections don't survive a
failover/failback
61.nfstickle: since we kill the tcp connections when we release an ip
address, we no longer need to restart the nfs service in 61.nfstickle
update ctdb_takeover to use the new signature for ctdb_sys_have_ip
when we add a tcp connection to kill in ctdb_killtcp_add_connection(),
check if either the source or destination address matches a known public
address
(This used to be ctdb commit f9fd2a4719c50f6b8e01d0a1b3a74b76b52ecaf3)
files
so that we can partition the cluster into different subsets of nodes
which each serve a different subset of the public addresses
(This used to be ctdb commit 889e0fe69e4c88c6166282b12843b8d9727552d6)
every time we release an ip.
This context is used to hold all resources needed when sending out
gratuitous arps and tcp tickles during ip takeover.
We hang it off the vnn structure that manages that particular ip address
instead, so that we can have multiple ones going in parallel.
This bug (or the same bug in a different shape) has probably been in ctdb
for a very long time, but it is likely to be hard to trigger.
(This used to be ctdb commit c58db1cadaba253b2659573673b28c235ef7db76)
multiple public addresses spread across multiple interfaces on each
node.
this is a massive patch since we have previously made the assumption that
we only have one public address per node.
get rid of the public_interface argument. the public addresses file
now explicitly lists which interface the address belongs to
(This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8)
controls to register/unregister/check a server id.
a server id consists of TYPE:VNN:ID where type is specific to the
application. VNN is the node where the serverid was registered and ID
might be a node unique identifier such as a pid or similar.
Clients can register a server id for themselves at the local ctdb daemon.
When a client disappears, or when the domain socket connection for the
client drops, then any and all server ids registered across that domain
socket will also be automatically removed from the store.
Clients can register as many server_ids as they want at the same time,
but each TYPE:VNN:ID must be globally unique (see the sketch below).
Clients have the option of explicitly unregistering a server id by using
the UNREGISTER control.
Registration and unregistration can only be done by clients to the local
daemon; clients can not register their server id with a remote node.
Clients can check whether a server id exists on any ctdb node in the
network by using the check control.
(This used to be ctdb commit d44798feec26147c5cc05922cb2186f0ef0307be)
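An illustrative shape for such a server id; the struct and field names are assumptions, not the real ctdb definitions:

    #include <stdbool.h>
    #include <stdint.h>

    struct example_server_id {
            uint32_t type;       /* application specific */
            uint32_t vnn;        /* node where the id was registered */
            uint64_t server_id;  /* node-unique value, e.g. a pid */
    };

    /* two registrations collide only if the full TYPE:VNN:ID triple matches */
    static bool server_id_equal(const struct example_server_id *a,
                                const struct example_server_id *b)
    {
            return a->type == b->type &&
                   a->vnn == b->vnn &&
                   a->server_id == b->server_id;
    }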
passing it as a parameter, we set the callback function explicitly from
the caller if the ..._send() function returned a valid state pointer.
(This used to be ctdb commit aa939570662786455f63299b62c99882cff29d42)
callback function which is called upon completion (or timeout) of the
control.
modify scanning of recmaster in the monitoring_cluster code to try the
api out
(This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c)
struct so that if a control times out we can print debug info such as
which opcode failed and to which node it was sent
we don't need the *status parameter to ctdb_client_control_state
create async versions of the getrecmaster control
pass a memory context to getrecmaster
(This used to be ctdb commit 558b680c82f830fba82c283c78c2de8a0b150b75)
node is not banned if the call is for a database record, i.e. a REQ/REPLY
CALL/DMASTER.
If we get such a call while banned, ignore the packet and write an entry
in the logfile
(This used to be ctdb commit 79eb0863609fbb12e28ebf734101b1d3f359b330)
places.
create a new helper function to generate new generation id values that
knows about the invalid id and avoids generating it.
update the ctdb status tool to know about the invalid generation id and
print the string INVALID instead
(This used to be ctdb commit 4fbcd189543cb8a92227fdcd3d158472e558ccda)
not participating in the cluster
if a client tries to attach to a database while the node is inactive,
return an error back to the client and fail the attach
(This used to be ctdb commit b26949f3c8e54f3bc60da04d7b4ac69f301068fc)
and it should thus no longer serve any database access calls until it
has been reintroduced into the cluster.
when becoming banned, reset the local generation id to 1 to prevent
any further database access calls from other nodes from being processed.
(This used to be ctdb commit b531021db43ebaa5f5d0ace28c59913d359bd8a8)
see both the old flags as well as the new flags (so we can tell which
flags changed)
send the CTDB_SRVID_RECONFIGURE messages to connected nodes only, not to
every node, connected or not, in the cluster.
in the handler inside the recovery daemon which is invoked for node flag
change messages, only do a takeover_run() and redistribute the ip addresses IF it was the
disabled or the unhealthy flags that changed. Also send out the cluster
reconfigured message in this case.
If any of the other flags changed we don't need to do the takeover_run()
here, since that will be done during recovery.
(This used to be ctdb commit 5549b2058e2c148a8ca9d419123acf3247bb8829)
from the administrator, log this as 'Received SHUTDOWN command. Stopping
CTDB daemon.' so that the administrator will know when looking at the
log 'why' the ctdb service was terminated.
Previously the only thing logged was 'shutting down' which is not
detailed enough.
(This used to be ctdb commit 5b818c1b72b6594a8d6e45e1865026e3ce33ae63)
change initial errors that cause ctdb to fail to start from printf to
DEBUG(0
add a DEBUG(0 to log that the ctdb service is starting
(This used to be ctdb commit 680b4fbb283dd68567a62a83345f11a6cc1dd0e5)
public address remain at that node until either the node becomes
unhealthy or the original/primary node for that address becomes healthy
again.
Otherwise what will happen is:
1, if we ban a node, the banning code immediately does a
takeover_run() and reassigns the public address to a different node in
the cluster.
2, a few seconds later (at most) the recovery daemon will detect that
the number of nodes has shrunk and will initiate a recovery.
During the recovery the public address would again be assigned to a
node, this time a different node.
(This used to be ctdb commit 30a6b7a648e22873d8ce6289a3d6dc42c4b9e3b3)
specific script /etc/ctdb/events.d/00.ctdb
get rid of CTDB_EVENTS_SCRIPT and --event-script
(This used to be ctdb commit 81ccfaf838e5772d4a58eb6a70224b7b39aba9f3)
instead of from /etc/ctdb/events, so that we can get better debugging
output in the logs when something fails in the scripts
(This used to be ctdb commit 4ed96b768aea1611e8002f7095d3c4d12ccf77a3)
tcp connection in the tree that stores the tcp connections to kill by
sending an RST
add a define that specifies the key length instead of hardcoding it as 4
(This used to be ctdb commit 6a8322cbae10f2c78b2e286c75aeb25ece12ea7f)
we store in the tree and use a node destructor so that when the data is
talloc_free()d we also remove the node from the tree.
(This used to be ctdb commit b8dabd1811ebd85ee031563e95085f720a2fa04d)
description (src + dst sockaddr_in) in a linked list.
Every time we received a captured packet from the network we had to walk
this list in linear time to see if the packet matched a connection we
wanted to RST,
which wouldn't scale very well.
Replace the linked list with a redblack tree that is indexed by
src address, src port, dst address, dst port
to make checking whether the packet belongs to a connection we want to
RST very fast and scalable (a sketch of such a key is shown below).
The reason we need to capture packets when we want to kill a TCP
connection is that we must wait for an ACK coming back from the
remote host so that we can learn which sequence number to use in the
RST.
Most tcp stacks today will ignore any and all RST segments unless the
sequence number lies exactly on the right edge of the window, to make
spoofing RSTs a little bit more difficult.
(This used to be ctdb commit ced18caea8582af042287beb6333dd1f8ba3344d)
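A sketch of a lookup key and comparator for such a tree (illustrative only, not the ctdb rb-tree code):

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdint.h>

    /* index captured packets by the full 4-tuple so that matching a
     * packet against the kill list is a tree lookup, not a linear scan */
    struct tcp_connection_key {
            struct in_addr src_addr;
            uint16_t       src_port;
            struct in_addr dst_addr;
            uint16_t       dst_port;
    };

    static int u32_cmp(uint32_t a, uint32_t b)
    {
            return (a > b) - (a < b);
    }

    /* total order over keys, usable as a red-black tree comparator */
    static int tcp_connection_key_cmp(const struct tcp_connection_key *a,
                                      const struct tcp_connection_key *b)
    {
            int c;

            c = u32_cmp(ntohl(a->src_addr.s_addr), ntohl(b->src_addr.s_addr));
            if (c != 0) return c;
            c = u32_cmp(a->src_port, b->src_port);
            if (c != 0) return c;
            c = u32_cmp(ntohl(a->dst_addr.s_addr), ntohl(b->dst_addr.s_addr));
            if (c != 0) return c;
            return u32_cmp(a->dst_port, b->dst_port);
    }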
update addr to the source address so that the printout in the log matches
the client that attached to samba
(This used to be ctdb commit 72098b71c79469c86769ca82bbd484c81902d27c)
by a talloc_steal()
use the returned pointer in talloc_steal as the value to assign
(This used to be ctdb commit 5c6375ad3bbecfa725ec3b1477f259e5a8191866)
tickles) just talloc_steal the entire tcp_array into the arp
structure instead of copying each of the entries into a linked list
and then releasing the tcparray.
(This used to be ctdb commit 468e237740cf37a65872ef700bbb1284ede8352a)
tickled all connections,
otherwise the other nodes will still remember this list until the next time
we have had a connection/client closing.
(This used to be ctdb commit cb8e5d4bbee2f14f498735489f673ff3679dfd9d)
there is an array for each node/public address that contains tcp tickles
we send a TCP_ADD as a broadcast to all nodes when a client is added
if tcp tickles are removed, they are only removed immediately from the
local node.
once every 20 seconds a node will push/broadcast out the tickle list for
all public addresses it manages. this will remove any deleted tickles
from the remote nodes
(This used to be ctdb commit e3c432a915222e1392d91835bc7a73a96ab61ac9)
ip/node
Once we have started sending all tickles for a specific ip, delete the
entire array so that the tickles don't remain forever in the ctdb
server.
Add a control to send the full list of every tickle that is registered
for a particular public ip/node.
(This used to be ctdb commit d0eee33e44d3f8e26debbec21d41e2cbdbb520e6)
back to 0 if it is to prevent an infinite loop.
this could happen if in the future we add a mechanism to add/remove
nodes to a cluster at runtime
(This used to be ctdb commit 217e80a468713fec86ccb0608460e3401046bb98)
to keep a static that controls at which node to start searching the
list for takeover candidates the next time we need to find a node.
Each time we find a node to take over an address, reset the start variable to point
to the next node in the list (sketched below).
This makes the distribution of takeover nodes much more even.
(This used to be ctdb commit e9800df5a21079ea478d16f7dd2fd4707de85650)
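A minimal sketch of that rotation (the helper is hypothetical, only the round-robin idea follows the description above):

    #include <stdbool.h>
    #include <stdint.h>

    /* hypothetical helper: can this node take over an address right now? */
    extern bool node_can_takeover(uint32_t pnn);

    static int find_takeover_node(uint32_t num_nodes)
    {
            static uint32_t start;      /* where the previous search stopped */
            uint32_t i;

            for (i = 0; i < num_nodes; i++) {
                    uint32_t pnn = (start + i) % num_nodes;

                    if (node_can_takeover(pnn)) {
                            start = (pnn + 1) % num_nodes;  /* rotate */
                            return (int)pnn;
                    }
            }
            return -1;  /* no candidate available */
    }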
specific routines populate it as they see fit when creating a
capture socket.
Pass this structure as a parameter to read_tcp and to the close capture socket routine.
(This used to be ctdb commit 79bbfcfb2223889126fe307d5bbfd24917da07ee)
let the caller create the sending socket and use a single socket instead
of one new one for each tickle.
pass a sending socket to ctdb_sys_send_tcp()
ctdb_sys_kill_tcp is no longer used, so remove it
set the socket flags for close-on-exec and nonblocking in the helper that
creates the sockets instead of in the caller
add a helper to create a sending socket to send tickles from
(This used to be ctdb commit 469f3fb238a0674a2b48fdf1a7e657e32428178a)
we might want to have two sockets attached to the killtcp structure,
one for capturing and a second one for sending, so we don't have to
create a new socket for each tickle we want to send
(This used to be ctdb commit b3e82ec38047bbec1edfd88ade264077d4cbd2ee)
this allows us to print from which node the "Invalid" or "Dropped orphan"
become-dmaster packets came
(This used to be ctdb commit 88efd1bf4c796cd2b184156b72296587bc38bb40)
don't let those messages modify the DISCONNECTED flag.
the DISCONNECTED flag must be managed locally since it describes whether
the local node can communicate with the remote node or not
(This used to be ctdb commit 5650673205d335a32d4f27f66847ea66752a00f0)
cluster, we can't check that both the BANNED and the DISCONNECTED flags
are set at the same time, since if a node becomes banned just
before it is DISCONNECTED there is no guarantee that all other nodes
will have seen the BANNED flag.
So we must first check the DISCONNECTED flag only, and only if the
DISCONNECTED flag is not set should we check the BANNED flag (sketched below).
Otherwise this can cause a recovery loop, while some nodes think the
disconnected node is DISCONNECTED|BANNED and others think it is just
DISCONNECTED.
(This used to be ctdb commit 0967b2fff376ead631d98e78b3a97253fc109c69)
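A sketch of that check order; the flag macros mirror the names used above, but the values and the helper are illustrative:

    #include <stdbool.h>
    #include <stdint.h>

    #define NODE_FLAGS_DISCONNECTED 0x00000001   /* illustrative values */
    #define NODE_FLAGS_BANNED       0x00000002

    static bool node_is_usable(uint32_t flags)
    {
            /* check DISCONNECTED first: not every node is guaranteed to
             * have seen the BANNED flag before the node dropped off */
            if (flags & NODE_FLAGS_DISCONNECTED) {
                    return false;
            }
            if (flags & NODE_FLAGS_BANNED) {
                    return false;
            }
            return true;
    }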
- split out the event script code into a separate module
- get rid of the separate takeover directory
(This used to be ctdb commit 8ea2c923a3e2464200ff79bf2c3f1f89e6a93ad4)
- added DatabaseHashSize tunable
- added logging of events inside recovery (for timing)
(This used to be ctdb commit 3593cdb928b91e217faf1b3c537fa28dc82cdace)