create a new debugging command xpnn which discovers the pnn of the local node and which works even if the local daemon is not running
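For example, on node 0 this might print (the output format shown is illustrative):
   ctdb xpnn
   PNN:0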
(This used to be ctdb commit cd78765f9400d7abce7929a2dd199f65226e7664)
this command shows which eventscripts were executed during the last monitoring cycle and the status from each eventscript.
If an eventscript timed out or returned an error we also
show the output from the eventscript.
Example :
[root@rcn1 ctdb-git]# ./bin/ctdb scriptstatus
6 scripts were executed last monitoring cycle
00.ctdb Status:OK Duration:0.021 Mon Mar 23 19:04:32 2009
10.interface Status:OK Duration:0.048 Mon Mar 23 19:04:32 2009
20.multipathd Status:OK Duration:0.011 Mon Mar 23 19:04:33 2009
40.vsftpd Status:OK Duration:0.011 Mon Mar 23 19:04:33 2009
41.httpd Status:OK Duration:0.011 Mon Mar 23 19:04:33 2009
50.samba Status:ERROR Duration:0.057 Mon Mar 23 19:04:33 2009
OUTPUT:ERROR: Samba tcp port 445 is not responding
Add a new helper function "switch_from_server_to_client()" which can be used both
by the recovery daemon and by the child process we start for running the actual eventscripts.
Create several new controls, both for the eventscript child process to inform the master daemon of the current status of the scripts and for the ctdb tool to extract this information from the running daemon.
(This used to be ctdb commit c98f90ad61c9b1e679116fbed948ddca4111968d)
Add two new dedicated ctdb error codes (see the example below):
21: node does not exist
22: node is disconnected
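An illustrative way a script could test for these codes (the command and node number are only examples):
   ctdb ip -n 9
   if [ $? -eq 21 ] ; then
       echo "node 9 does not exist"
   fi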
(This used to be ctdb commit 7ee6db06162ad5a554058bb6160ad37b24fe42e0)
block and wait until the cluster has completed the recovery before returning.
this makes it easier to script since it avoids the common need for
   ctdb recover
   ... complex loop to wait for recovery to complete ...
   script continues
(This used to be ctdb commit 8a0df9324a03b0f17772c64a9331236126c22124)
If set, this specifies the maximum runtime for the ctdb tool before it terminates with status == 20,
just like the -T ... option would.
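For example, with an illustrative limit of 10 seconds:
   ctdb -T 10 recover
would terminate with status 20 if the command has not completed within 10 seconds.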
(This used to be ctdb commit c404d57afb2adda039e676877838927d3073df11)
change the ban/unban logic to wait until we are not in recovery before it bans/unbans the node.
also wait until after the cluster has recovered from the ban/unban before returning, so that the cluster is in recovery mode == normal when the command returns. this makes it much easier to script things ...
(This used to be ctdb commit 39c77371a2f995025a584691fe61af12dc6ed5d7)
this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing.
(This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e)
modify the transport methods to allow restarting individual connections
and set up destructors properly.
only tear down/set-up tcp connections to nodes removed from the cluster
or nodes added to the cluster.
Leave tcp connections to unchanged nodes connected.
make "ctdb reloadnodes" explicitely cause a recovery of the cluster once
the files have been realoaded
(This used to be ctdb commit d1057ed6de7de9f2a64d8fa012c52647e89b515b)
"ctdb delip x.x.x.x -n all"
This is not as straightforward as one might think since during the
delete process we do not want the ip to be bouncing from one node to
another while node after node deletes it.
Thus we first delete the ip from all connected nodes which are not
currently hosting it.
After this we delete the ip from the node which is hosting it.
(This used to be ctdb commit bbd46f341e9aa32d8dbd49f7a9a07cb3f1f92ea3)
Encode the database name in the header so we don't need to provide the database
name when doing a restore
Encode a timestamp in the header telling us when the backup was created
(This used to be ctdb commit 77762170ad1dbc4620565bb898af5d493fac117d)
ctdb backupdb : which will copy a database out from ctdb and write it to a file
ctdb restoredb : which will read a database backup from a file and write it into ctdb
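Example usage (database and file names are illustrative):
   ctdb backupdb registry.tdb /tmp/registry.backup
   ctdb restoredb /tmp/registry.backup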
(This used to be ctdb commit b567e215f5c58d646a392408b9cc1df8ef029b33)
This file creates additional locking stress on the backend filesystem and we may not need it anyway.
(This used to be ctdb commit 84236e03e40bcf46fa634d106903277c149a734f)
lvs: which shows which nodes are active LVS servers
lvsmaster: which shows which node is the lvs master multiplex node
pnn: which prints the pnn of the local node
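For example, a script can use the pnn command to look up the local node number (assuming output of the form PNN:<number>):
   THIS_NODE=$(ctdb pnn | sed -e "s/PNN://")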
(This used to be ctdb commit 00025eef662b867293829228c681df491cd6f371)
make ctdb uptime print how long the recovery took
in the recovery daemon, when we check that the public ip address
allocation on the local node is correct (we have the ips we should have
and we don't have any we shouldn't have), use ctdb uptime and check the
recovery start/stop times to make sure we don't check for ip allocation
inconsistencies during a recovery, when the ip address allocation is in flux.
(This used to be ctdb commit f86551580349b7f662f9a07e4eb0c1189e38e429)
this callback is called for every node where the control failed (or timed out)
when we issue the start recovery control from the recovery master,
set any node that fails as a culprit so that it will eventually be banned
(This used to be ctdb commit 72f89bac13cbe8c3ca3e7a942469cd2ff25abba2)
ctdb_attach() so that we can pass TDB_NOSYNC when we attach to
a persistent database and want fast unsafe writes instead of
slow but safe tdb_transaction writes.
enhance the ctdb_persistent test suite to test both safe and unsafe writes
(This used to be ctdb commit 4948574f5a290434f3edd0c052cf13f3645deec4)
This enhances the framework for sending tcp tickles to be able to send ipv6 tickles as well.
Since we cannot use one single RAW socket to send both handcrafted ipv4 and ipv6 packets, instead of always opening TWO sockets (one ipv4 and one ipv6) we get rid of the helper ctdb_sys_open_sending_socket() and just open (and close) a raw socket of the appropriate type inside ctdb_sys_send_tcp().
We know which type of socket v4/v6 to use based on the sin_family of the destination address.
Since ctdb_sys_send_tcp() opens its own socket we no longer need to pass a socket
descriptor as a parameter. Get rid of this redundant parameter and fix up all callers.
(This used to be ctdb commit 406a2a1e364cf71eb15e5aeec3b87c62f825da92)
This allows us to use the async framework also for controls that return
outdata.
Add a "capabilities" field to the ctdb_node structure. This field is
only initialized and kept valid inside the recovery daemon context and not
inside the main ctdb daemon.
change the GET_CAPABILITIES control to return the capabilities in outdata instead of in the res return variable.
When performing a recovery inside the recovery daemon, read the capabilities from all connected nodes and update the ctdb->nodes list of nodes.
when building the new vnnmap after the database rebuild in recovery, do not include any nodes which lack the LMASTER capability in the new vnnmap.
Unless there is no available connected node with the LMASTER capability, in which case we let the local node (recmaster) take on the lmaster role temporarily (i.e. become a member of the vnnmap list).
(This used to be ctdb commit 0f1883c69c689b28b0c04148774840b2c4081df6)
Define two capabilities :
can be recmaster
can be lmaster
Default both capabilities to YES
Update the ctdb tool to read capabilities off a node
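For example, to read the capabilities of a node (the subcommand name getcapabilities and the output format are assumptions):
   ctdb getcapabilities -n 3
   RECMASTER: YES
   LMASTER: YES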
(This used to be ctdb commit 50f1255ea9ed15bb8fa11cf838b29afa77e857fd)
If no other node is hosting this public ip at the moment, then assign it immediately to the current node.
(This used to be ctdb commit a63825e32658b36e0964584758b9a276c18056b8)
this collects all public addresses from all nodes and presents the public ips
for the entire cluster
(This used to be ctdb commit cbf79b2158ab21a58aef967e89f0bd60890a7972)
this collects all public addresses from all nodes and presents the public ips
for the entire cluster
(This used to be ctdb commit 0a4e667f42c6fb23be13651f7b0d0a545a49900b)
and a ctdb command to pull the talloc memory map from a recovery daemon
ctdb rddumpmemory
(This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05)
The controls only modify the runtime setting of which public addresses a node
can serve and do not modify /etc/ctdb/public_addresses.
To make the change permanent you also need to edit /etc/ctdb/public_addresses
manually.
After ip addresses have been added/deleted you need to invoke a recovery
for the ip addresses to be redistributed.
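For example, assuming the controls are exposed through the ctdb tool as addip/delip (address, netmask and interface are illustrative):
   ctdb addip 10.1.1.100/24 eth0
   ctdb recover
and edit /etc/ctdb/public_addresses as well to make the change permanent.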
(This used to be ctdb commit f8294d103fdd8a720d0b0c337d3973c7fdf76b5c)
Add back the controls to enable/disable monitoring we used to have for debugging but removed a while ago
(This used to be ctdb commit 8477f6a079e2beb8c09c19702733c4e17f5032fe)
ctdb moveip <IPADDRESS> <NODE>
which can be used to manually fail an ip address over to a specific node.
This can only be used if DeterministicIPs are disabled and also only if NoIPFailback is enabled.
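For example (the address and node number are illustrative, and the setvar lines show one way of satisfying the two preconditions above):
   ctdb setvar DeterministicIPs 0
   ctdb setvar NoIPFailback 1
   ctdb moveip 10.1.1.1 2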
(This used to be ctdb commit ffee062b7e26a6aa6ad254edb58399040ecaa542)
add a new control that causes the node to drop the current nodes list
and reread it from the nodes file.
During this operation, the node will also drop the tcp layer and restart it.
We drop the tcp layer by talloc_free()ing the ctcp structure.
Add a destructor to ctcp so that we can also clean up and remove the references in the ctdb structure to the transport layer.
add two new commands for the ctdb tool:
one to list all nodes in the nodesfile and one to trigger a node to drop the transport and reinitialize it with the new nodes file
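For example, after editing /etc/ctdb/nodes (assuming the two commands are named listnodes and reloadnodes; reloadnodes is referred to by that name earlier in this log):
   ctdb listnodes
   ctdb reloadnodes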
(This used to be ctdb commit 4bc20ac73e9fa94ffd43cccb6eeb438eeff9963c)
memory tree to stdout. This is much more useful than putting it in the log, and also fixes
a bug where the pipe would overflow internally and cause ctdbd to lock up
(This used to be ctdb commit e236979e2162d9bd7a495086342168a696cf76c5)
ctdb vacuum : vacuums all the databases, deleting any zero length
ctdb records
ctdb repack : repacks all the databases, resulting in a perfectly
packed database with no freelist entries
(This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4)
ctdb_recoverd.c
Always handle banning/unbanning locally on the node that is being
banned/unbanned instead of on the recovery master.
This means that if a ban request comes in to the recovery master for a
remote node, we pass the request on to the remote node instead of
setting up the ban and ban timeouts locally.
ctdb.c
send ban/unban requests to the node being banned/unbanned instead of to
the recmaster
(This used to be ctdb commit 880dd9f5fd0b91e450da93e195cc5c62cb1dcd6e)
monitoring should always be enabled
(though a node may want to temporarily disable running the "monitor"
event scripts but can do so internally without the need for this
control)
(This used to be ctdb commit e3a33618026823e6af845fd8513cddb08e6b5584)
the recmaster or not.
return 0 if the node is the recmaster and 1 (true) if it is not or if
we could not communicate with the ctdb daemon.
call it 'isnotrecmaster' to cope with the fact that if the tool could not bind to
the socket to talk to the daemon, the tool will automatically return an
error and exit code 1
thus the tool will only return 0 if it could talk successfully to the
local daemon and if the local daemon confirms this node is the recmaster
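A minimal scripting sketch based on the exit codes described above:
   if ctdb isnotrecmaster ; then
       echo "this node is the recovery master"
   else
       echo "this node is not the recovery master (or the local daemon could not be reached)"
   fi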
(This used to be ctdb commit ae5fcb790b6c3985f514fa8a96bc00c2619f2a28)
and merge into a big list since, with the deassociation between a node
and a public ip address, the /etc/ctdb/public_addresses files can
differ between nodes and no node knows about all public addresses that a
cluster can use
(This used to be ctdb commit e208294fed183977cacc44b2cd1195c11d967c18)
this command no longer makes sense when there is no one-to-one mapping
between a node and its default public ip
(This used to be ctdb commit 91280db7f6dd3d659edd86fae21ba347d6f9da9e)
multiple public addresses spread across multiple interfaces on each
node.
this is a massive patch since we have previously made the assumption that
we only have one public address per node.
get rid of the public_interface argument. the public addresses file
now explicitly lists which interface the address belongs to
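For example, /etc/ctdb/public_addresses might now look like this (addresses, masks and interface names are illustrative):
   10.1.1.1/24 eth1
   10.1.1.2/24 eth1
   10.1.2.1/24 eth2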
(This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8)
controls to register/unregister/check a server id.
a server id consists of TYPE:VNN:ID where type is specific to the
application. VNN is the node where the serverid was registered and ID
might be a node unique identifier such as a pid or similar.
Clients can register a server id for themselves at the local ctdb daemon.
When a client disappears or when the domain socket connection for the
client drops then any and all server ids registered across that domain
socket will also be automatically removed from the store.
clients can register as many server_ids as they want at the same time
but each TYPE:VNN:ID must be globally unique.
Clients have the option of explicitly unregistering a server id by using
the UNREGISTER control.
Registration and unregistration can only be done by clients to the local
daemon. clients can not register their server id with a remote node.
clients can check whether a server id exists on any ctdb node in the
network by using the check control
(This used to be ctdb commit d44798feec26147c5cc05922cb2186f0ef0307be)
callback function which is called upon completion (or timeout) of the
control.
modify scanning of recmaster in the monitoring_cluster code to try the
api out
(This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c)
struct so that if a control times out we can print debug info such as
what opcode failed and to which node
we don't need the *status parameter to ctdb_client_control_state
create async versions of the getrecmaster control
pass a memory context to getrecmaster
(This used to be ctdb commit 558b680c82f830fba82c283c78c2de8a0b150b75)
places.
create a new helper function to generate new generation id values that
knows about the invalid id and avoids generating it.
update the ctdb status tool to know about the invalid generation id and
print the string INVALID instead
(This used to be ctdb commit 4fbcd189543cb8a92227fdcd3d158472e558ccda)
there is an array for each node/public address that contains tcp tickles
we send a TCP_ADD as a broadcast to all nodes when a client is added
if tcp tickles are removed, they are only removed immediately from the
local node.
once every 20 seconds a node will push/broadcast out the tickle list for
all public addresses it manages. this will remove any deleted tickles
from the remote nodes
(This used to be ctdb commit e3c432a915222e1392d91835bc7a73a96ab61ac9)
let the caller create the sending socket and use a single socket instead
of one new one for each tickle.
pass a sending socket to ctdb_sys_send_tcp()
ctdb_sys_kill_tcp is no longer used so remove it
set the socketflags for close on exec and nonblocking in the helper that
creates the sockets instead of in the caller
add a helper to create a sending socket to send tickles from
(This used to be ctdb commit 469f3fb238a0674a2b48fdf1a7e657e32428178a)
the tcp connection
change the 60.nfs script to run ctdb killtcp in the foreground so we
don't get lots of these running in parallel when there are a lot of tcp
connections to RST
(This used to be ctdb commit d81616214752882242f2886e94681972a790db80)
- split out the event script code into a separate module
- get rid of the separate takeover directory
(This used to be ctdb commit 8ea2c923a3e2464200ff79bf2c3f1f89e6a93ad4)