mirror of https://github.com/samba-team/samba.git synced 2025-01-07 17:18:11 +03:00

979 Commits

Author SHA1 Message Date
Ronnie Sahlberg
d6736b3720 we allocated one byte too little in the blob we need to send as the control to the server.
(This used to be ctdb commit 10e585413c217d9b9c32ff3d2fb3d8f24183c458)
2008-04-03 16:35:23 +11:00
Ronnie Sahlberg
e8e67ef576 add a mechanism to force a node to run the eventscripts with arbitrary arguments
ctdb eventscript "command argument argument ..."

(This used to be ctdb commit 118a16e763d8332c6ce4d8b8e194775fb874c8c8)
2008-04-02 11:13:30 +11:00
Ronnie Sahlberg
27a7f854f5 add improvements to tracking memory usage in ctdbd and the recovery daemon
and a ctdb command to pull the talloc memory map from a recovery daemon
ctdb rddumpmemory

(This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05)
2008-04-01 15:34:54 +11:00
Ronnie Sahlberg
a1334246cf make sure the iface string is null-terminated in the addip control packet
(This used to be ctdb commit 983490556bc12fe03de4c22b5fdc12d15c11d43c)
2008-03-31 12:49:39 +11:00
Ronnie Sahlberg
0d7b34c9e5 Add two new controls to add/delete public ip address from a node at runtime.
The controls only modify the runtime setting of which public addresses a node
can serve and do not modify /etc/ctdb/public_addresses.
To make the change permanent you also need to edit /etc/ctdb/public_addresses
manually.

After ip addresses have been added/deleted you need to invoke a recovery
for the ip addresses to be redistributed.

(This used to be ctdb commit f8294d103fdd8a720d0b0c337d3973c7fdf76b5c)
2008-03-27 09:23:27 +11:00
Ronnie Sahlberg
2863d2cfd1 From M Dietz,
Add back the controls to enable/disable monitoring we used to have for debugging but removed a while ago

(This used to be ctdb commit 8477f6a079e2beb8c09c19702733c4e17f5032fe)
2008-03-25 08:27:38 +11:00
Ronnie Sahlberg
74d57f8d51 Redo the vacuuming process to make it scalable.
Vacuuming used to delete one record at a time on all nodes, that was
m*n behaviour and would require a huge storm of ctdb->ctdb controls and just wouldn't scale at all.

The new vacuuming process collects all records to be deleted locally and then only sends 1 control to the other nodes. This control contains a list of all records to be deleted.

(This used to be ctdb commit 9e625ece19a91f362c9539fa73b6b2108f0d9c53)
2008-03-13 07:53:29 +11:00
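The scaling difference described in the vacuuming commit above can be illustrated with a rough count of controls sent. This is a sketch in Python, not ctdb's C code; the function names and the counting model are assumptions made purely for illustration.

```python
def controls_per_record(num_records: int, num_nodes: int) -> int:
    """Old scheme: one delete control per record to every other node."""
    return num_records * (num_nodes - 1)


def controls_batched(num_records: int, num_nodes: int) -> int:
    """New scheme: one control per other node, carrying the whole list."""
    if num_records == 0:
        return 0  # nothing to delete, nothing to send
    return num_nodes - 1


def build_delete_batches(local_deletable, other_nodes):
    """Collect deletable keys locally, then address one list to each node."""
    keys = sorted(local_deletable)
    return {node: list(keys) for node in other_nodes}
```

Under this model, 1000 deletable records on a 4-node cluster go from 3000 controls to 3.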
Ronnie Sahlberg
b1cf2b5653 Update ctdb uptime to provide machinereadable output
(This used to be ctdb commit 4f7f8aa6f178115b551ac35f7df2ec5aad054fe2)
2008-03-04 13:29:48 +11:00
Ronnie Sahlberg
61b52e0e64 provide machinereadable -Y output for 'ctdb getdebug'
(This used to be ctdb commit 646f4d9a01637685e967fb3ecc042fc97c0b7529)
2008-03-04 13:23:06 +11:00
Ronnie Sahlberg
212fbb42d5 make 'ctdb ip' provide machinereadable output using '-Y'
(This used to be ctdb commit 446e2f4e650b12d6fce5677a6841006462c23dba)
2008-03-04 13:18:27 +11:00
Ronnie Sahlberg
d9b534b59d A new command to 'ctdb'
ctdb moveip <IPADDRESS> <NODE>

which can be used to manually fail an ip address over to a specific node.

This can only be used if DeterministicIPs are disabled and also only if NoIPFailback is enabled.

(This used to be ctdb commit ffee062b7e26a6aa6ad254edb58399040ecaa542)
2008-03-04 12:20:23 +11:00
Ronnie Sahlberg
e0036942bc add a new file <reclock>.pnn where each recovery daemon can lock that byte at offset==pnn to offer an alternative way to detect which nodes are active instead of relying on CONNECTED being accurate.
(This used to be ctdb commit 21d3319eaf463e2a00637d440ee2d4d15f53bf09)
2008-02-29 12:37:42 +11:00
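The per-node byte lock described in the commit above could look roughly like the following. This is a Python sketch that assumes POSIX byte-range locks via `fcntl`; the helper name `lock_pnn_byte` is hypothetical and the code is not taken from ctdb.

```python
import fcntl
import os


def lock_pnn_byte(path: str, pnn: int):
    """Try to take an exclusive lock on the single byte at offset == pnn.

    Returns the open file descriptor on success (keep it open to hold
    the lock), or None if another process already holds that byte.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Lock 1 byte starting at offset pnn, without blocking.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, pnn, os.SEEK_SET)
    except OSError:
        os.close(fd)
        return None
    return fd
```

A process on another node could probe the same offsets: any byte it cannot lock belongs to a node whose recovery daemon is still alive, independent of the CONNECTED flag.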
Ronnie Sahlberg
4adeafef11 add a control to get the name of the reclock file from the daemon
(This used to be ctdb commit 9effb22cc1616d684352d7ebabb359e69adb0f52)
2008-02-29 10:03:39 +11:00
Ronnie Sahlberg
16c4e9c4aa make the ctdb reloadnodes reload the nodes file on all nodes and restart the transport
(This used to be ctdb commit 6272ad33b4af6ea9d6fd0ac877df3f75be45d665)
2008-02-21 08:25:01 +11:00
Ronnie Sahlberg
9f99b44fd1 to make it easier/less disruptive to add nodes to a running cluster
add a new control that causes the node to drop the current nodes list
and reread it from the nodes file.
During this operation, the node will also drop the tcp layer and restart it.

When we drop the tcp layer, by talloc_free()ing the ctcp structure
add a destructor to ctcp so that we also can clean up and remove the references in the ctdb structure to the transport layer

add two new commands for the ctdb tool.
one to list all nodes in the nodesfile and the second a command to trigger a node to drop the transport and reinitialize it with the new nodes file

(This used to be ctdb commit 4bc20ac73e9fa94ffd43cccb6eeb438eeff9963c)
2008-02-19 14:44:48 +11:00
Andrew Tridgell
275cd68867 nicer use of structures and use isalpha()
(This used to be ctdb commit 19b5fbcd16596a4b6c22056585dd4bd988db3db7)
2008-02-05 10:36:06 +11:00
Ronnie Sahlberg
3f56526037 Specify and print debuglevels by name and not by number
(This used to be ctdb commit 79ad830294b8b677fbd0c5ad7ed6fbde71f74f8d)
2008-02-05 10:26:23 +11:00
Andrew Tridgell
f6e53f433b merge from ronnie
(This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c)
2008-02-04 20:07:15 +11:00
Andrew Tridgell
146d4b0db7 merge async recovery changes from Ronnie
(This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2)
2008-01-29 13:59:28 +11:00
Andrew Tridgell
3777346629 partial merge from ronnie
(This used to be ctdb commit fd316deb8a9e0545c8efa1bfc8ad83962b310405)
2008-01-29 11:39:06 +11:00
Andrew Tridgell
eb044bb1d6 make ctdb dumpmemory work remotely, and dump the talloc
memory tree to stdout. This is much more useful than putting it in the log, and also fixes
a bug where the pipe would overflow internally and cause ctdbd to lockup

(This used to be ctdb commit e236979e2162d9bd7a495086342168a696cf76c5)
2008-01-22 14:22:41 +11:00
Ronnie Sahlberg
9055978b46 add a ctdb uptime command that prints when ctdb was started and when the
last recovery occurred

(This used to be ctdb commit b86e8ccbdac044bb949c4fc2ebb27635126272a9)
2008-01-17 11:33:23 +11:00
Ronnie Sahlberg
5b7838d768 ctdb_control_send() does not need to take an outdata parameter
remove the outdata parameter from the function and all callers

(This used to be ctdb commit e3951337f8df2ae19cce61c954036590c7a03582)
2008-01-16 10:23:26 +11:00
Andrew Tridgell
3b3fceacbe block alarm signals during critical sections of vacuum
(This used to be ctdb commit cfb14ae76f00f10d27b56c034b2247ab12d63065)
2008-01-10 09:43:14 +11:00
Andrew Tridgell
2119f0a66c add a max runtime switch to ctdb tool
(This used to be ctdb commit b681e4f2011481aebbe18fd0147c2d500caf2705)
2008-01-10 08:04:54 +11:00
Andrew Tridgell
5d9913642f allow remote variable expansion in onnode, so you can use wildcards that expand on the remote nodes
(This used to be ctdb commit def643225a1cb31d4999f3e73fad368ae60048ad)
2008-01-09 15:04:56 +11:00
Andrew Tridgell
bb3f77d61d changed default vacuum limit
(This used to be ctdb commit 7ca2977c12cf7938da639a17a0f857d7029d749c)
2008-01-09 08:28:18 +11:00
Andrew Tridgell
673a2b46f9 nicer output from repack and vacuum
(This used to be ctdb commit 446c76bc332fe1366c32898fb77279a902d7159c)
2008-01-08 23:02:43 +11:00
Andrew Tridgell
0ee375ad66 this is not an error - it just means the record was busy
(This used to be ctdb commit 749451a4e97330d0fc35f5366dcc61aa500f7ce9)
2008-01-08 22:36:44 +11:00
Andrew Tridgell
1c91398aef ensure the recovery daemon is not clagged up by vacuum calls
(This used to be ctdb commit ff7e80e247bf5a86adda0ef850d901478449675b)
2008-01-08 21:28:42 +11:00
Andrew Tridgell
96100fcae6 added two new ctdb commands:
ctdb vacuum   : vacuums all the databases, deleting any zero length
                 ctdb records

 ctdb repack   : repacks all the databases, resulting in a perfectly
                 packed database with no freelist entries

(This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4)
2008-01-08 17:23:27 +11:00
Andrew Tridgell
d38fbaa38b nicer onnode output
(This used to be ctdb commit ac5c1e090d007bc2e3965589731620b87c0217fb)
2008-01-07 14:31:13 +11:00
Ronnie Sahlberg
7cef33b40a rework banning/unbanning nodes
ctdb_recoverd.c
Always handle banning/unbanning locally on the node that is being 
banned/unbanned instead of on the recovery master.
This means that if a ban request comes in to the recovery master for a 
remote node, we pass the request on to the remote node instead of 
setting up the ban and ban timeouts locally.

ctdb.c
send ban/unban requests to the node being banned/unbanned instead of to 
the recmaster

(This used to be ctdb commit 880dd9f5fd0b91e450da93e195cc5c62cb1dcd6e)
2007-12-03 15:45:53 +11:00
Ronnie Sahlberg
0eb6c04dc1 get rid of the control to set the monitoring mode.
monitoring should always be enabled
(though a node may want to temporarily disable running the "monitor"
event scripts but can do so internally without the need for this 
control)

(This used to be ctdb commit e3a33618026823e6af845fd8513cddb08e6b5584)
2007-11-30 10:00:04 +11:00
Andrew Tridgell
684282f7a1 added bonding info to ctdb_diagnostics
(This used to be ctdb commit 71b5fc434bc5d88eb0669ee29aa932ba12737e07)
2007-10-30 10:18:52 +11:00
Andrew Tridgell
6e6de1e4b7 fixed a problem with backgrounding onnode
(This used to be ctdb commit 4e23630224bb219cfbbf129c4562da5a4c2d601a)
2007-10-22 21:11:02 +10:00
Ronnie Sahlberg
9a93f4b8df reverse the order in which public ips are listed so it matches the order
of the public_addresses file

(This used to be ctdb commit ce987661edd9160982e65866fb773445d296e5c7)
2007-10-17 13:42:42 +10:00
Andrew Tridgell
6b9d73a96d more detail on multipath config
(This used to be ctdb commit 78c44f2267cbef5fbc57d56dfd5ff40972733a1f)
2007-10-16 20:13:28 +10:00
Ronnie Sahlberg
80cd82f8e4 add a control to send gratuitous arps from the ctdb daemon
(This used to be ctdb commit 563819dd1acb344f95aabb4bad990b36f7ea4520)
2007-10-09 11:56:09 +10:00
Ronnie Sahlberg
ab5d098bf6 add a function in the ctdb tool to determine whether the local node is
the recmaster or not.

return 0 if the node is the recmaster and 1 (true) if it is not or if 
we could not communicate with the ctdb daemon.


call it 'isnotrecmaster' because if the tool cannot bind to the socket
to talk to the daemon, it will automatically return an error and
exit code 1.
thus the tool will only return 0 if it could talk successfully to the
local daemon and if the local daemon confirms this node is the recmaster

(This used to be ctdb commit ae5fcb790b6c3985f514fa8a96bc00c2619f2a28)
2007-10-08 09:47:20 +10:00
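The exit-code convention in the commit above can be summarised as a small truth table. The function below is a hypothetical Python sketch of that convention only; it does not talk to a daemon and is not the ctdb tool's code.

```python
def isnotrecmaster_exit_code(daemon_reachable: bool, is_recmaster: bool) -> int:
    """Exit code as described: 0 only when the daemon confirms recmaster."""
    if not daemon_reachable:
        return 1  # could not talk to the daemon: generic error exit
    return 0 if is_recmaster else 1
```

Naming the command in the negative means a failure to reach the daemon and "not the recmaster" both yield 1, so a script never mistakes an unreachable daemon for recmaster status.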
Andrew Tridgell
c60988325d added support for persistent databases in ctdbd
(This used to be ctdb commit 3115090a0d882beca9d70761130b74bb0821f201)
2007-09-21 12:24:02 +10:00
Andrew Tridgell
0438b07b53 separate out the various fs display ops
(This used to be ctdb commit dc89e1a428da5d5ca2a9c4988c05de3ea65f00f4)
2007-09-19 11:46:11 +10:00
Andrew Tridgell
bd7eeebe16 expanded ctdb_diagnostics a bit
(This used to be ctdb commit 70a4bb3dc7e624ad778949dbc874c2617fd532e6)
2007-09-17 15:31:33 +10:00
Andrew Tridgell
ed75f988d5 merge from ronnie
(This used to be ctdb commit 913c33a7d2f67570548fecc568dba874e5f72dd2)
2007-09-14 15:23:23 +10:00
Ronnie Sahlberg
2d0261afeb let ctdb ip only print the ip addresses known to the specified node
and not the entire cluster

(This used to be ctdb commit eb1f67a56d752c9f42a9a26a6697a7ab8e668b3a)
2007-09-14 15:19:44 +10:00
Andrew Tridgell
023b885793 new approach for killing TCP connections on IP release
(This used to be ctdb commit c33a0db29b5604966f582b1f8c5fd66760c72197)
2007-09-13 10:24:48 +10:00
Andrew Tridgell
a919f6927a fixed return code
(This used to be ctdb commit 30165b5a19f9bd9d1f62c9c222df0711c1c6a927)
2007-09-13 10:02:56 +10:00
Andrew Tridgell
42168177ef merge from ronnie
(This used to be ctdb commit 1f21d4d563232926c35d03c4d69eb69190823dc6)
2007-09-10 13:21:11 +10:00
Andrew Tridgell
f3927719c9 add crontab and sysctl output
(This used to be ctdb commit b1b59f3294ee7a5ed6d685f373bf19d3152170fa)
2007-09-10 11:27:07 +10:00
Ronnie Sahlberg
d91b28f8b7 ctdb ip must loop over all connected nodes to pull the public ip list
and merge into a big list, since with the deassociation between a node
and a public ipaddress the /etc/ctdb/public_addresses files can
differ between nodes and no node knows about all public addresses that a
cluster can use

(This used to be ctdb commit e208294fed183977cacc44b2cd1195c11d967c18)
2007-09-07 16:45:19 +10:00
Ronnie Sahlberg
3cad21d6be remove the ctdb publicip command
this command no longer makes sense when there is no one-to-one mapping
between a node and its default public ip

(This used to be ctdb commit 91280db7f6dd3d659edd86fae21ba347d6f9da9e)
2007-09-07 15:39:26 +10:00
Ronnie Sahlberg
68c37f9b41 merge from tridge
(This used to be ctdb commit 58c918b1bfe09c31049769dee266129cbad4cb20)
2007-09-07 09:21:40 +10:00
Andrew Tridgell
c572d3c226 added a diagnostics tool for ctdb
(This used to be ctdb commit 032a2238caf688656b00e06bf363182368e037e1)
2007-09-05 14:20:34 +10:00
Ronnie Sahlberg
157be530dd change ctdb_ctrl_getvnn to ctdb_ctrl_getpnn
(This used to be ctdb commit ef47cc4cd416065c69382e4d9e76c30a0a34e42f)
2007-09-04 10:38:48 +10:00
Ronnie Sahlberg
211b497818 change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn
change ctdb_ban_info.vnn to ctdb_ban_info.pnn

(This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a)
2007-09-04 10:33:10 +10:00
Ronnie Sahlberg
6f693bbcbd change server_id.vnn to server_id.pnn
(This used to be ctdb commit 26f2ee2b754a9271454412f05111a19b3013c6eb)
2007-09-04 10:21:51 +10:00
Ronnie Sahlberg
4ba9990143 change vnn to pnn in the ctdb tool
(This used to be ctdb commit 822556a4d4ba23459be3a25cbd3f48d1f64ba95f)
2007-09-04 10:14:41 +10:00
Ronnie Sahlberg
12ebb74838 change how we do public addresses and takeover so that we can have
multiple public addresses spread across multiple interfaces on each 
node.

this is a massive patch since we have previously made the assumption that
we only have one public address per node.

get rid of the public_interface argument.  the public addresses file
now explicitly lists which interface the address belongs to

(This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8)
2007-09-04 09:50:07 +10:00
Ronnie Sahlberg
801bdbdc80 add a control to pull the server id list off a node
(This used to be ctdb commit 38aa759aa88a042c31b401551f6a713fb7bbe84e)
2007-08-26 10:57:02 +10:00
Ronnie Sahlberg
6681da31df add an initial implementation of a service_id structure and three
controls to  register/unregister/check a server id.

a server id consists of TYPE:VNN:ID    where type is specific to the 
application.  VNN is the node where the serverid was registered and ID 
might be a node unique identifier such as a pid or similar.


Clients can register a server id for themselves at the local ctdb daemon.
When a client disappears, or when the domain socket connection for the
client drops, then any and all server ids registered across that domain
socket will also be automatically removed from the store.

clients can register as many server_ids as they want at the same time    
but each TYPE:VNN:ID must be globally unique.

Clients have the option of explicitly unregistering a server id by using
the UNREGISTER control.


Registration and unregistration can only be done by clients to the local 
daemon. clients can not register their server id to a remote node.


clients can check if a server id does exist on any ctdb node in the 
network by using the check control

(This used to be ctdb commit d44798feec26147c5cc05922cb2186f0ef0307be)
2007-08-24 15:53:41 +10:00
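The registration rules in the commit above can be modelled with a small in-memory store. This Python sketch covers a single daemon only; the class and method names are hypothetical and do not correspond to ctdb's actual C structures or control opcodes.

```python
class ServerIdStore:
    """Per-daemon store of (type, vnn, id) tuples keyed by client connection."""

    def __init__(self, local_vnn: int):
        self.local_vnn = local_vnn
        self._by_client = {}  # connection handle -> set of server ids

    def register(self, client, type_: int, vnn: int, id_: int) -> bool:
        if vnn != self.local_vnn:
            return False  # clients may only register with the local daemon
        sid = (type_, vnn, id_)
        if self.check(sid):
            return False  # each TYPE:VNN:ID must be unique
        self._by_client.setdefault(client, set()).add(sid)
        return True

    def unregister(self, client, sid) -> None:
        self._by_client.get(client, set()).discard(sid)

    def client_disconnected(self, client) -> None:
        # Dropping the connection removes everything registered over it.
        self._by_client.pop(client, None)

    def check(self, sid) -> bool:
        return any(sid in ids for ids in self._by_client.values())
```

Because the VNN is part of the tuple and registration is local-only, uniqueness within each daemon is enough in this model to make the full TYPE:VNN:ID unique across the cluster.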
Ronnie Sahlberg
f854b5f876 try out a slightly different api for controls where you provide a
callback function which is called upon completion (or timeout) of the 
control.

modify scanning of recmaster in the monitoring_cluster code to try the 
api out

(This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c)
2007-08-23 19:27:09 +10:00
Ronnie Sahlberg
8fd3df2553 hang the ctdb_req_control structure off the ctdb_client_control_state
struct  so that if we timeout a control we can print debug info such as 
what opcode failed and to which node

we dont need the *status parameter to ctdb_client_control_state

create async versions of the getrecmaster control

pass a memory context to getrecmaster

(This used to be ctdb commit 558b680c82f830fba82c283c78c2de8a0b150b75)
2007-08-23 13:00:10 +10:00
Ronnie Sahlberg
f6e0336b23 create a define to represent the 'invalid' generation id we used in two
places.

create a new helper function to generate new generation id values that 
know about the invalid id and avoids generating it.

update the ctdb status tool to know about the invalid generation id and 
print the string INVALID instead

(This used to be ctdb commit 4fbcd189543cb8a92227fdcd3d158472e558ccda)
2007-08-22 12:38:31 +10:00
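The helper described in the commit above might be sketched as follows. This is illustrative Python, not ctdb's C code: the value chosen for `INVALID_GENERATION`, the 32-bit width, and both function names are assumptions.

```python
import random

INVALID_GENERATION = 1  # hypothetical reserved value, chosen for illustration


def new_generation() -> int:
    """Return a random 32-bit generation id that is never the invalid one."""
    while True:
        generation = random.getrandbits(32)
        if generation != INVALID_GENERATION:
            return generation


def format_generation(generation: int) -> str:
    """Render a generation id the way a status tool might print it."""
    return "INVALID" if generation == INVALID_GENERATION else str(generation)
```

Keeping the reserved value behind one named constant means the generator and the status output cannot disagree about which id counts as invalid.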
Ronnie Sahlberg
26d3cd38a9 change fprintf(stderr to DEBUG(0, now that client DEBUGs are redirected
to stderr

(This used to be ctdb commit 14078130d295014a751f3e0039bc8eaf427440f9)
2007-08-08 10:19:42 +10:00
Andrew Tridgell
8a81a03b9e merge from ronnie
(This used to be ctdb commit e06f70f064e39f1a4a394f00b81b6b1d215534d4)
2007-08-07 13:40:13 +10:00
Ronnie Sahlberg
d69055b789 change error output in ctdb and in ctdb_cmdline_client to print to
stderr instead of stdout

(This used to be ctdb commit 6e6e165c2d8f0963ce37567c23aaa012fc3e89d9)
2007-08-07 12:51:25 +10:00
Ronnie Sahlberg
2b51871bad add a ctdb command to print the default public ip of a host.
(This used to be ctdb commit 7de5489f6ebd0e5671e7afa5cb51471043ee46d1)
2007-08-07 12:10:05 +10:00
Ronnie Sahlberg
fca90ce3c3 updated ctdb tickle management
there is an array for each node/public address that contains tcp tickles

we send a TCP_ADD as a broadcast to all nodes when a client is added

if tcp tickles are removed, they are only removed immediately from the 
local node.
once every 20 seconds a node will push/broadcast out the tickle list for 
all public addresses it manages.   this will remove any deleted tickles 
from the remote nodes

(This used to be ctdb commit e3c432a915222e1392d91835bc7a73a96ab61ac9)
2007-07-20 15:05:55 +10:00
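The add, local-remove, and periodic-push behaviour in the commit above can be modelled like this. It is a Python sketch with hypothetical names; the real implementation is in C inside the daemon, and the 20-second timer is left out.

```python
class TickleLists:
    """Tickle lists per public address, as seen by one node."""

    def __init__(self):
        self.by_address = {}  # public address -> set of (client_ip, client_port)

    def add(self, address, conn):
        # Applied on every node when a TCP_ADD broadcast arrives.
        self.by_address.setdefault(address, set()).add(conn)

    def remove_local(self, address, conn):
        # Removals take effect on the local node only.
        self.by_address.get(address, set()).discard(conn)

    def snapshot(self, managed_addresses):
        # What the owning node pushes out periodically.
        return {a: set(self.by_address.get(a, set())) for a in managed_addresses}

    def apply_push(self, pushed):
        # Receivers replace their lists, dropping tickles deleted at the owner.
        for address, conns in pushed.items():
            self.by_address[address] = set(conns)
```

In this model a remote node may hold a stale tickle until the next push arrives, which matches the "only removed immediately from the local node" wording.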
Ronnie Sahlberg
a650497680 as an optimization for when we want to send multiple tickles at a time
let the caller create the sending socket and use a single socket instead 
of one new one for each tickle.
pass a sending socket to ctdb_sys_send_tcp()

ctdb_sys_kill_tcp is no longer used so remove it

set the socketflags for close on exec and nonblocking in the helper that 
creates the sockets instead of in the caller

add a helper to create a sending socket to send tickles from

(This used to be ctdb commit 469f3fb238a0674a2b48fdf1a7e657e32428178a)
2007-07-12 09:22:06 +10:00
Ronnie Sahlberg
76ab80104a make the ctdb tool use the killtcp control in the daemon instead of
calling killtcp directly

(This used to be ctdb commit d21e3e9cf11bdcba6234302e033d6549c557dd69)
2007-07-12 08:30:04 +10:00
Andrew Tridgell
32de198fd3 update lib/replace from samba4
(This used to be ctdb commit f0555484105668c01c21f56322992e752e831109)
2007-07-10 15:29:31 +10:00
Ronnie Sahlberg
d81bca2072 make it possible to specify how many times ctdb killtcp will try to RST
the tcp connection

change the 60.nfs script to run ctdb killtcp in the foreground so we 
dont get lots of these running in parallel when there are a lot of tcp 
connections to rst

(This used to be ctdb commit d81616214752882242f2886e94681972a790db80)
2007-07-10 10:24:20 +10:00
Andrew Tridgell
871ef93b82 fixed help layout
(This used to be ctdb commit ee8acf166961838a3a82d658a66407ba5ccb4939)
2007-07-05 10:00:51 +10:00
Andrew Tridgell
3b4fa64dd9 fixed error message on bad IP/port
(This used to be ctdb commit ad2d8615c028d55bc5e94c9d7bd8432cafde4a69)
2007-07-05 09:59:45 +10:00
Ronnie Sahlberg
71ba917444 add a command to ctdb to send tickle-ack's
(This used to be ctdb commit 83ddb6eaa269fbc5f235d606ee21239a7e0e23d2)
2007-07-05 08:56:02 +10:00
Andrew Tridgell
bdf01ed7c0 - neaten up the command line for killtcp
- split out the event script code into a separate module
- get rid of the separate takeover directory

(This used to be ctdb commit 8ea2c923a3e2464200ff79bf2c3f1f89e6a93ad4)
2007-07-04 16:51:13 +10:00
Ronnie Sahlberg
5ad7f642f3 we dont need socketkiller anymore now that the
kill-tcp-connection code is available from the ctdb tool

(This used to be ctdb commit c24890ad44b535c989bd21e83d619a1bd4825834)
2007-07-04 14:16:28 +10:00
Ronnie Sahlberg
ab6564c83d add a killtcp command to the ctdb tool
(This used to be ctdb commit 01987b51fed0dc0b9a5e254fa734bdeb19debf6f)
2007-07-04 14:14:48 +10:00
Ronnie Sahlberg
edcab7e068 ETH_P_IP does not work on my ubuntu system so changing it back to the
slightly less efficient ETH_P_ALL

(This used to be ctdb commit 84b8c77654b6c24928f63c801b183390824a3f6f)
2007-07-04 13:27:08 +10:00
Andrew Tridgell
2014d3959f merge from ronnie
(This used to be ctdb commit b5510446073d6a058d11dabf92bef0e9721cd861)
2007-07-04 13:14:45 +10:00
Ronnie Sahlberg
597aa7ed59 initial version of a socketkiller tool
checked in so it is not lost 

this tool takes a socketpair as arguments and will reset the tcp 
connection

(This used to be ctdb commit bddd448740ef7f5a88b8549a3d184a94ac9fcd96)
2007-07-04 12:52:07 +10:00
Andrew Tridgell
1ac8a52891 simpler handling of -n all in ctdb tool
(This used to be ctdb commit 68c7c33c2863d4073e5129b24eb79454643dc65f)
2007-06-11 22:25:26 +10:00
Ronnie Sahlberg
6613396ad5 update the blurb for the setmonmode control it takes 0 or 1 as a
parameter depending on whether one wants to disable or enable monitoring

(This used to be ctdb commit 849a1cce6cc3e145925dd4a8a38b2698be0ce8d5)
2007-06-09 07:54:37 +10:00
Andrew Tridgell
06a71762a4 some #include cleanups
(This used to be ctdb commit 1a07d87122d51a40cd8ad5fe13533298c26857cb)
2007-06-07 22:26:27 +10:00
Andrew Tridgell
b50096c835 more code rearrangement
(This used to be ctdb commit 2bcf3b16163041f03add2e5bf9f1f5fb3599ec24)
2007-06-07 22:16:48 +10:00
Andrew Tridgell
96861466b7 there are now far too many controls for the controls statistics fields to be useful
(This used to be ctdb commit f5e188fc7e13b55b6b4081dcc74ea9614a76f9bb)
2007-06-07 18:07:38 +10:00
Andrew Tridgell
cb4c33cc68 handle CTDB_CURRENT_NODE in ban commands
(This used to be ctdb commit fefb53f1d22c5458a1e107f8352818aee87983de)
2007-06-07 16:48:31 +10:00
Andrew Tridgell
23bf62fe30 added admin commands to ban/unban nodes
(This used to be ctdb commit 4dad04172e7e4955b5bf6444a85b19901c9683ad)
2007-06-07 16:34:33 +10:00
Andrew Tridgell
2ed57a9ae1 implement a scheme where nodes are banned if they continuously caused the cluster
to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes)

(This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c)
2007-06-07 15:18:55 +10:00
Andrew Tridgell
9754d16d48 merged admin enable/disable change from ronnie
(This used to be ctdb commit df17b69dfd83a98f9c711994c7dd51ad2cc0ab8a)
2007-06-07 11:15:22 +10:00
Ronnie Sahlberg
d93c6f8db2 show the disabled/permanently disabled status in the machinereadable
output for 'ctdb status'

(This used to be ctdb commit a9e920a492e1e91d205ee8b9cd704a7cf85c1e01)
2007-06-07 09:27:51 +10:00
Ronnie Sahlberg
9ff733c784 add a control to permanently enable/disable a node
(This used to be ctdb commit d66fdba16ca22f62ddac6882a17614879b08a798)
2007-06-07 09:16:17 +10:00
Andrew Tridgell
341d715f1a formatting fix for wider variable names
(This used to be ctdb commit 195bde145f62221a7bb1b12014ada98ad5df4e9e)
2007-06-06 22:17:46 +10:00
Andrew Tridgell
b130540102 merged vsftpd event script from ronnie
(This used to be ctdb commit c0b686c43524c6a93c52d85b0079ed820983133e)
2007-06-06 10:29:27 +10:00
Andrew Tridgell
af8834dd02 added health monitoring logic to ctdb, so a node loses its public IP address if one of the subsystem event scripts reports a problem
(This used to be ctdb commit c7a089256d86cec21097453bce5acbccee87413f)
2007-06-06 10:25:46 +10:00
Ronnie Sahlberg
91a97fea03 provide machinereadable output for ctdb ip
(This used to be ctdb commit 86348de0bfdc4f91ff6f5a8eeff06044d512ee43)
2007-06-05 18:32:06 +10:00
Andrew Tridgell
ac55bc4166 first step in health monitoring of cluster nodes. When not healthy they will be marked disabled
(This used to be ctdb commit d3dbd9fc4db21632075b56fc52cf95435c63374a)
2007-06-05 17:43:19 +10:00
Ronnie Sahlberg
4be9a44ba7 add a control that lists all public ip addresses and which node
currently serves each

(This used to be ctdb commit db9b89dc423b31079e5502323e5fd2bbaf82e1e9)
2007-06-04 21:11:51 +10:00
Ronnie Sahlberg
1ee8989bd4 merge from tridge
(This used to be ctdb commit 3bfede5d46dba5a3654dad9205534391bc339461)
2007-06-04 20:10:53 +10:00
Andrew Tridgell
dbb2ec43dd added tunables settable using ctdb command line tool
(This used to be ctdb commit 73d440f8cb19373cfad7a2f0f0ca4f963c57ff29)
2007-06-04 19:53:19 +10:00
Andrew Tridgell
f1d81386e6 - start moving tunable variables into their own structure
- fixed the test scripts to use a separate dbdir

(This used to be ctdb commit 396752e8908c48373564e915e2d49cfc9ff61eba)
2007-06-04 17:46:37 +10:00
Andrew Tridgell
a57991c0eb remove some cruft thats not needed any more
(This used to be ctdb commit c4308805b997740b77e058c1a14b84cb400a7c30)
2007-06-04 17:23:55 +10:00
Ronnie Sahlberg
464ed12991 merge from tridge
(This used to be ctdb commit 948b449748a126386f49ef9e763cfffd8b651516)
2007-06-04 15:44:13 +10:00
Andrew Tridgell
cc9f6d30d8 split out the basic interface handling, and run event scripts in a deterministic order
(This used to be ctdb commit 399e993a4a233a5953e1e7264141e5c7c8c8c711)
2007-06-04 15:09:03 +10:00
Ronnie Sahlberg
8a53a6aa29 show the second column in the machinereadable output for ctdb status as
IP

(This used to be ctdb commit 9ee38e8cfc4b602f6769549a83a1302138e055a1)
2007-06-04 13:31:58 +10:00
Ronnie Sahlberg
a3e4e204dc add the ip address to the nodemap structure we pull from a server and
display the physical address of a node when we do a ctdb status

(This used to be ctdb commit 660bf30db713f0680acd3f74275ad603b35a0c24)
2007-06-04 13:26:07 +10:00
Ronnie Sahlberg
5dde7e27e0 add a -Y option to generate machine readable output.
print 'ctdb status' in machinereadable form as
:VNN:0|1:

(This used to be ctdb commit 1aa6a632ec59d854fc5579fedad0d66b1b46ae8c)
2007-06-03 19:50:51 +10:00
Andrew Tridgell
794d6dd59d move config files to config/ directory
(This used to be ctdb commit f95de519b885c8e1f40df0cda70fd796e479a22a)
2007-06-02 19:40:07 +10:00
Andrew Tridgell
2f5af51c53 add an easy way to setup ctdb to start/stop samba
(This used to be ctdb commit b0d9f427d83aff5b9a5c54b7b7c9d45d418e2352)
2007-06-02 18:51:05 +10:00
Ronnie Sahlberg
ebe34b4353 update the event scripts for nfs and nfslock to honour CTDB_MANAGES_NFS
which is set in /etc/sysconfig/nfs

(This used to be ctdb commit bf475269231a6129f88b37f4da69e06efcf4ed77)
2007-06-02 16:44:15 +10:00
Ronnie Sahlberg
5dc243ff93 STATD_SHARED_DIRECTORY should be defined in the nfs sysconfig file and
not the ctdb sysconfig file since this variable has nothing at all to do 
with ctdb

(This used to be ctdb commit d17073b7da5ecba1b93a5ed4fbdf86bf052fdc90)
2007-06-02 16:33:17 +10:00
Andrew Tridgell
ebf12646cf - make specification of a recovery lock file compulsory
- die if someone other than the recmaster can get the recovery lock

(This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869)
2007-06-02 11:36:42 +10:00
Andrew Tridgell
3a0395dffd added nfs event script
(This used to be ctdb commit a708a635a1be355d2e8d382166f58f65f669c8ea)
2007-06-02 00:10:22 +10:00
Andrew Tridgell
dff9a6ecd1 make the packaging much more portable - tested on SLES9 and RHEL4
(This used to be ctdb commit 9521e3eee42b11303a2d6e0f5c05d0c0de4292d8)
2007-06-01 23:25:33 +10:00
Andrew Tridgell
1fa2600c8b - make symlink relative in install
- include ctdb functions in samba and nfslock event scripts

(This used to be ctdb commit 08e2278069346b1fc49774603aa26c68222cf67f)
2007-06-01 21:20:05 +10:00
Andrew Tridgell
b5890ad2c1 split out events for each subsystem separately
(This used to be ctdb commit 03c629a72f234dcc783fa1085e7edba09597c241)
2007-06-01 20:54:26 +10:00
Andrew Tridgell
559a8bd278 use a subdirectory for ctdb state files
(This used to be ctdb commit 71ebf272be42e313715f0f100be9f5567127eb73)
2007-06-01 19:16:58 +10:00
Andrew Tridgell
f5171454b3 log dates/time in event startup messages
(This used to be ctdb commit 60a2f704f2e0544035778d00e91041e09351ed8f)
2007-06-01 15:23:16 +10:00
Andrew Tridgell
95ed6f8725 added CTDB_WAIT_DIRECTORIES support
(This used to be ctdb commit fa888e8b1715d7460f5718d3e1fe17e4caaa15c3)
2007-06-01 13:50:18 +10:00
Ronnie Sahlberg
86d0fc8e4f it is -f not -x to check if a file exists
(This used to be ctdb commit 52457d5e811f91c051ce0fa32739667a1d21862a)
2007-06-01 13:26:14 +10:00
Ronnie Sahlberg
425b3c56c6 - create /etc/ctdb/taken_ips and /etc/ctdb/changed_ips analogous to the
existing /etc/ctdb/released_ips

- only call the statd-callout script if the ips have changed  and call 
it with a "notify" argument.    we need to restart nfslock service in 
both cases

- change statd-callout to explicitly restart the lock manager and statd
when "notify" is called.   copy the state directory for each held ip 
from shared storage to /tmp then use sm-notify to send notifications to 
all monitored clients

(This used to be ctdb commit 800f15a27af885a3f83430d3bc411cc72ac40e86)
2007-06-01 13:14:05 +10:00
Andrew Tridgell
bf3b740a1b ctdb is GPL not LGPL
(This used to be ctdb commit 8624378010d1c2a1438e1e701339dfba7276f960)
2007-05-31 13:50:53 +10:00
Andrew Tridgell
d86298248f better location for statd-callout
(This used to be ctdb commit cc208c447b732aeeaefd6a889711d3cd83ea128e)
2007-05-31 11:14:07 +10:00
Andrew Tridgell
c6d4478fda added hooks to make nfs statd behave correctly on failover
(This used to be ctdb commit a1ee84fc47892b6c18d417ccf714211fcb07952e)
2007-05-31 11:09:45 +10:00
Andrew Tridgell
d510ce3281 use our own netmask when deciding if we should take over an IP, not the other nodes
- check if ctdb dies while waiting for the startup event

(This used to be ctdb commit 8b59f73c527a6d0a8abe8030dc3cbbc4329657be)
2007-05-30 16:11:39 +10:00
Andrew Tridgell
3eb96b4553 - nice messages while waiting for tcp services to come up
- added more comments to sysconfig file

(This used to be ctdb commit 9cbe7ad147a73cd6594fa7bcee0544fd986ad8c0)
2007-05-30 12:37:03 +10:00
Andrew Tridgell
b382fac817 wait for local tcp services like smbd to come up before allowing ctdb to start talking to other nodes
(This used to be ctdb commit 04eea084ebf1710ea66ccb03ac661e3b2f58d96f)
2007-05-30 12:27:58 +10:00
Andrew Tridgell
7cd7081beb support ctdb status -n all
(This used to be ctdb commit 8ff2ea29fc60a1e9854bf0c59c360e29f35d3b69)
2007-05-30 11:12:50 +10:00
Andrew Tridgell
229846cdd2 moved onnode into ctdb from s3 examples/ctdb
(This used to be ctdb commit a3fdaebf1a90ff3c2843a592f6c657a8eae42975)
2007-05-30 11:00:43 +10:00
Andrew Tridgell
5747a5a358 auto-restart NFS if its running when we release an IP
(This used to be ctdb commit 2e1e1e8e34bf4c15decbbc8f0ca88004a2ed67df)
2007-05-30 10:21:16 +10:00
Andrew Tridgell
9891c6b975 flush any local arp entries for the given ip on add/del
(This used to be ctdb commit 814decd66423e955b443f0729ceec581c0d0c0e3)
2007-05-29 19:34:04 +10:00
Andrew Tridgell
3b146e7616 don't block SIGCHLD, or we lose return values from system() !
nicer log messages from events script

(This used to be ctdb commit 5ed2b496675a6a47d7ad87519a97bc4f293e6730)
2007-05-29 17:23:29 +10:00
Andrew Tridgell
2f7fcecb59 fixed shell syntax in events script
(This used to be ctdb commit 629435807e7927a0e1524cd3e4b2aa216a651e2c)
2007-05-29 16:28:18 +10:00
Andrew Tridgell
578b2a585d - make more options configurable
- fixed some warnings

(This used to be ctdb commit e08bb371827b14a80a131ce8e83145cd468e7e1f)
2007-05-29 16:02:02 +10:00
Andrew Tridgell
edcaa0d6a0 clean shutdown in ctdb - release all our IPs
(This used to be ctdb commit 2f196cb6a86eb85205d7de1c4cadd4e1e701c06f)
2007-05-29 13:33:59 +10:00
Andrew Tridgell
1455d7d7ad don't need maskbits to ip addr del
(This used to be ctdb commit 93125b460a44934f30bb995ff3c5365ac5a263d5)
2007-05-29 13:21:37 +10:00
Andrew Tridgell
6cd49d7842 fixed syntax of /sbin/ip
(This used to be ctdb commit 9791901dda000fbef6e520531f39ead575531721)
2007-05-29 13:09:15 +10:00
Andrew Tridgell
1becc9f2e7 made events script executable
(This used to be ctdb commit 54934884ae2bfe8b7d155aa22ee90b2d0a674def)
2007-05-29 13:04:52 +10:00
Andrew Tridgell
a39eff68a8 added an example ctdb event script
(This used to be ctdb commit f97b75497d005306c5f893c3182f1c2a9b4dc6b7)
2007-05-29 13:01:31 +10:00
Andrew Tridgell
ccf4d78e04 - renamed ctdb_control utility to ctdb
- use -n to specify node number in ctdb utility

- change 'ctdb status' to 'ctdb statistics'

- added 'ctdb status' which shows status

- added netmask to public IPs, so you don't try a takeover on a
  foreign network

- cleaned up tools/ctdb_control.c a lot

- generate usage message at runtime

(This used to be ctdb commit 28de71c03ace7d32a9fd9882fabbd5d668b97656)
2007-05-29 12:16:59 +10:00
Andrew Tridgell
922d054bca remove experimental code
(This used to be ctdb commit f1d91002247bedb2f163cc9a9515bbe2bbc2692e)
2007-05-27 16:58:43 +10:00
Andrew Tridgell
957ec5d63a fixed tcp data offset and checksum
(This used to be ctdb commit 2df23e0d3df52b746e9aee8d194ad1da16b62657)
2007-05-27 16:56:12 +10:00
Andrew Tridgell
56e3eed3d1 added IP takeover logic for public IPs to ctdb
(This used to be ctdb commit 374adb729472670f35cef41269b8719f49c0de0e)
2007-05-25 17:04:13 +10:00
Ronnie Sahlberg
2b6c39a0af add controls to take over and release an ip address
add sending of grat arp: both normal grat arp (request) and also
unsolicited grat arp replies

(This used to be ctdb commit 7305c00c21c30bdbafc3722a018513378bd307e6)
2007-05-25 13:05:25 +10:00
Ronnie Sahlberg
2aface246e add a new command for ctdb_control to trigger a recovery
(This used to be ctdb commit 6da2a4ab1b9c955d55a1c6817506a74539623892)
2007-05-24 08:08:45 +10:00
Andrew Tridgell
296e15c9d4 fixed some memory leaks on the traverse code
(This used to be ctdb commit 2781cbb7d00c5448449216c8c0c1b37bdc74a6c0)
2007-05-23 20:06:37 +10:00
Ronnie Sahlberg
e989a1bac8 add controls to enable/disable the monitoring of dead nodes
(This used to be ctdb commit 79d29c39bb81feb069db3fc6d3d392c1e75a4d13)
2007-05-21 09:24:34 +10:00
Andrew Tridgell
ab66fb840e removed obsolete ctdb_dump tool
(This used to be ctdb commit e3ed6fd65896f07fc76405acb2e16f50f04a0a3c)
2007-05-19 14:07:01 +10:00
Ronnie Sahlberg
db4c479568 add dead node detection so that if a node does not generate any
keepalive traffic for x seconds, it is deemed dead

this triggers a recovery after a while if a ctdbd has been STOPPED,
but it doesn't recover automatically when the node reappears

(This used to be ctdb commit d6324afe0d13b5e21d06e347caca433c6b36a32a)
2007-05-18 19:19:35 +10:00
Ronnie Sahlberg
f4738f9c41 we no longer pass lmaster across during pulldb so don't print it from
catdb either

(This used to be ctdb commit b57d60f4789ea7f0dd69c93f6629d8742e182576)
2007-05-17 12:07:29 +10:00
Ronnie Sahlberg
cc760cf13a add a control to shutdown/kill a node
(This used to be ctdb commit 3802f7304fd59d56062c855987e2561753e85a69)
2007-05-17 10:45:31 +10:00
Andrew Tridgell
a5198559c9 moved the recovery daemon into the main ctdbd and enable it by default
(This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0)
2007-05-15 15:13:36 +10:00
Andrew Tridgell
81826da2df added error messages in ctdb_control replies
(This used to be ctdb commit bd848f5b760e6b2a73ebfc67fd8adb3c31479fb5)
2007-05-12 21:25:26 +10:00
Andrew Tridgell
5bd0e50086 added -t option to ctdb_control
(This used to be ctdb commit 658141280eeb121a570d71c4b0af36d03004f320)
2007-05-12 16:04:56 +10:00
Andrew Tridgell
2c90d9e794 show total frozen/recovering in status
(This used to be ctdb commit 0d0eb66a63fe6912edb85bf7387ac76acb70babd)
2007-05-12 15:51:08 +10:00
Andrew Tridgell
b327cd872d report number of frozen/thawed nodes
(This used to be ctdb commit 997720bc0e15d882aefed3464fe285674beed691)
2007-05-12 15:44:56 +10:00
Andrew Tridgell
9cf77dd23f separate out the freeze/thaw handling from recovery
(This used to be ctdb commit 0b0640bd8b8334961f240e0cf276ac112cd6e616)
2007-05-12 15:15:27 +10:00
Andrew Tridgell
f8765b19bf - got rid of the complex hand marshalling in the recovery controls
- fixed the re-send of ctdb calls after a generation change

- fixed a reqid idr leak in controls

- removed the write_record test code

- use the new nonblock lockall code to prevent ctdbd from ever doing a
  blocking lock that could deadlock with smbd

- moved more of the recovery controls into ctdb_recover.c

(This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec)
2007-05-10 17:43:45 +10:00
Andrew Tridgell
15bc97cdaa better timeout handling for calls, controls and traverses
(This used to be ctdb commit 63346a6c59d4821b4c443939b5d88db8cd20f5fe)
2007-05-10 14:06:48 +10:00
Andrew Tridgell
31cd92dc7e merge from ronnie
(This used to be ctdb commit 92b7a849565730744c75a7fb776173554e9f57bf)
2007-05-10 13:15:58 +10:00
Ronnie Sahlberg
82e37a9886 update ctdb_control to create a correct ctdb_vnn_map->map array
(This used to be ctdb commit e510cc89068557881688d6cada38915b3e51f8cd)
2007-05-10 10:03:21 +10:00
Andrew Tridgell
1e38ae491f remove old s3 recovery code
fixed vnnmap wire format in recover daemon

(This used to be ctdb commit e03fab7bfe0cf43f40c49a3d63e75dc44001d8d8)
2007-05-10 08:49:57 +10:00
Ronnie Sahlberg
2befe18e29 add a small tool to monitor recovery
(This used to be ctdb commit b45936828713c31ee670e2106b49c2351234f310)
2007-05-09 08:05:53 +10:00
Ronnie Sahlberg
39d81cffb1 recovery daemon with recovery master election
election is primitive, it elects the lowest vnn as the recovery master

two new controls, to get/set recovery master for a node

to use recovery daemon, start one
./bin/recoverd --socket=ctdb.socket*
for each ctdb daemon

it has been briefly tested by deleting and adding nodes to a 4 node
cluster but needs more testing

(This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3)
2007-05-07 06:51:58 +10:00
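The election rule described in commit 39d81cffb1 above (the lowest vnn becomes recovery master) can be sketched as follows. This is an illustrative Python sketch, not the ctdb implementation; the function name `elect_recovery_master` and the idea of passing the active vnns as a set are assumptions made for the example.

```python
def elect_recovery_master(active_vnns):
    """Pick the recovery master among the currently active nodes.

    Hypothetical helper: mirrors the "primitive" rule from the commit
    message, where the node with the lowest vnn wins the election.
    """
    if not active_vnns:
        raise ValueError("no active nodes to elect from")
    return min(active_vnns)


# Example: nodes 1, 2 and 3 are active, node 0 is down.
master = elect_recovery_master({3, 1, 2})
```

Because every node applies the same deterministic rule to the same set of active nodes, they all arrive at the same master without exchanging votes.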
Ronnie Sahlberg
7bbcc964f2 add support in catdb to dump the content of a specific node's tdb instead
of traversing the full cluster.
this makes it easier to debug recovery

update the test script for recovery to reflect the newish signatures to
ctdb_control

the catdb control does still segfault, however, when there are missing
nodes in the cluster, as there are toward the end of the recovery test

(This used to be ctdb commit 8de2a97c14a444f817ceb36461314f10c9601ecc)
2007-05-06 05:53:15 +10:00
Ronnie Sahlberg
25edbc9a50 add a control to get the pid of a daemon.
this makes it possible to kill a specific daemon in the recover test 
script

(This used to be ctdb commit 2fa394b4c80988cb1a6d04b236ec64cc9d9e8a40)
2007-05-06 04:31:22 +10:00
Ronnie Sahlberg
2e64727079 merge from tridge
(This used to be ctdb commit 8648104f8d76d22427c14422b126f7e979cc2d95)
2007-05-05 16:51:34 +10:00
Andrew Tridgell
9636c97c5a show number of connected clients in status output
(This used to be ctdb commit 99765bbe327bfe9c43415f4943281458f25be51b)
2007-05-05 14:09:46 +10:00
Ronnie Sahlberg
5cb817f031 split the vnn broadcast address into two
one broadcast address for all nodes
and one broadcast address for all nodes in the current vnnmap

update all usage of the old flag to now only broadcast to the vnnmap
except for tools/ctdb_control where it makes more sense to broadcast to 
all nodes

(This used to be ctdb commit dfb65b88cf67ad9d61268c4b47a6d8ae346f47df)
2007-05-05 13:17:26 +10:00
Andrew Tridgell
410d41480a added a dumpmemory control, used to find memory leaks
(This used to be ctdb commit 44fdafaf421e3e906796d529aed2f7c5df201b94)
2007-05-05 11:03:10 +10:00
Andrew Tridgell
adc64aed0a - fixed a crash bug after client disconnect in ctdb_control
- added total memory used to ctdb_control status output

(This used to be ctdb commit a99ffe4372edc63d83d8c8ebf9a60b3413301f5a)
2007-05-05 08:33:35 +10:00
Andrew Tridgell
d8f4e6b209 - added counters for controls in ctdb_control status
(This used to be ctdb commit 858061372fc9902837a1a5b8bcfc0ada58eec193)
2007-05-05 08:11:54 +10:00
Ronnie Sahlberg
508cafd17e merge from tridge
(This used to be ctdb commit 6c8b90cedc67daa89d54db5268fde18bfc20abaf)
2007-05-04 17:05:28 +10:00
Ronnie Sahlberg
7dfdab1b9d recovery daemon
this program is a client to the local ctdb daemon

every second it pulls all vnnmap and nodemaps from all nodes that are 
available and checks if a recovery is required

a recovery is required if :
* all nodes do NOT have an identical vnnmap and generation
* all nodes do NOT have an identical nodemap
* there are active nodes that are NOT in the nodemap
* there are nodes in the nodemap that are NOT active

During recovery,  the recovery tool will also make sure that all nodes 
know about and have created all databases.

(This used to be ctdb commit 2f2650467bac7e8954de7c17cb34f46b0bdbcd26)
2007-05-04 15:21:40 +10:00
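The four recovery conditions listed in commit 7dfdab1b9d above can be sketched as a single check. This is an illustrative Python sketch under stated assumptions, not the recoverd code: the `NodeView` structure, its field names, and the `recovery_required` function are hypothetical and only model "what each reachable node reports".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeView:
    """What one node reports (hypothetical structure for illustration)."""
    generation: int
    vnnmap: tuple   # vnns this node believes are in the vnn map
    nodemap: tuple  # vnns this node lists as cluster members


def recovery_required(views, active):
    """Return True if any of the four conditions from the commit holds.

    views:  dict mapping vnn -> NodeView for every node that answered
    active: set of vnns that are currently reachable
    """
    reports = list(views.values())
    if not reports:
        return False
    first = reports[0]
    # 1. all nodes must share an identical vnnmap and generation
    if any((v.generation, v.vnnmap) != (first.generation, first.vnnmap)
           for v in reports):
        return True
    # 2. all nodes must share an identical nodemap
    if any(v.nodemap != first.nodemap for v in reports):
        return True
    listed = set(first.nodemap)
    # 3. active nodes that are NOT in the nodemap
    if active - listed:
        return True
    # 4. nodes in the nodemap that are NOT active
    if listed - active:
        return True
    return False
```

In this sketch the check would run once per polling interval (the commit says every second), with `views` rebuilt from fresh replies each time.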
Andrew Tridgell
5c4a477120 make catdb take a dbname instead of an id
(This used to be ctdb commit 365346345c33d2f310bb23d0c6ab5c3ed5e6e938)
2007-05-04 13:25:30 +10:00
Andrew Tridgell
f2fd53056d nicer interface to ctdb traverse
(This used to be ctdb commit e5ce866dcc5037b5069e42bf1e168b646f007b01)
2007-05-04 12:18:39 +10:00
Andrew Tridgell
e752f3bd97 - changed the REQ_REGISTER PDU to be a control
- allow controls to know which client invoked them

- added a client_id to clients, so they can be identified remotely

- added the ability to remove registered srvids

- in the list_keys code, register a temp srvid, then remove it afterwards

(This used to be ctdb commit 29603c51cc6d81362532cd8e50f75c8360c5f5ef)
2007-05-04 11:41:29 +10:00
Ronnie Sahlberg
2b1714a521 update getvnnmap control to take a timeout parameter
don't explicitly free the vnnmap pointer in the getvnnmap control; this
is freed by the mem_ctx instead

add code to the recoverd to detect when/if recovery is required
verify that the number of active nodes, the nodemap and the vnn map are
consistent across the entire cluster and, if not, trigger a recovery
(which right now just prints "we need to do recovery" to the screen).

(This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5)
2007-05-04 09:45:53 +10:00
Ronnie Sahlberg
ae73784c28 change the signature for ctdb_ctrl_getnodemap() so that a timeout
parameter is added.
change ctdb_get_connected_nodes in the same way

(This used to be ctdb commit d85f23bcf4c1230225abb2f4a053c70b68d469aa)
2007-05-04 09:01:01 +10:00
Ronnie Sahlberg
ebc478749b start working on a recovery daemon
change ctdb_control so it takes a timeval pointer as argument.
this is the timeout. if the node has not responded within the timeout
ctdb_control will return an error instead of hanging.
if the timeval pointer is NULL then the call will block indefinitely if 
there is no response.

this is used for now in the createdb control, but all the helpers
ctdb_ctrl_* should probably be updated to take a timeout parameter as 
well.

(This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211)
2007-05-04 08:30:18 +10:00
Andrew Tridgell
486c6b4fce merged from ronnie
(This used to be ctdb commit 57a80110ddfd202f8de37297db76dc43a064e476)
2007-05-03 13:53:54 +10:00
Ronnie Sahlberg
d88154b24a cleanup getnodemap
(This used to be ctdb commit 3867ccf71a167fb82dbc5a3f03f968a325a0c70b)
2007-05-03 13:30:38 +10:00
Ronnie Sahlberg
633ae7f346 fixup getdbmap control so it looks a bit nicer
(This used to be ctdb commit 78a4d61cb78da20af5210488e685c91bc3023e90)
2007-05-03 13:07:34 +10:00
Andrew Tridgell
472b96d6d3 first stage of efficient non-blocking ctdb traverse
(This used to be ctdb commit 4c23e6f26bde421bb56b55de9d6cd3e319b2be40)
2007-05-03 12:16:03 +10:00
Ronnie Sahlberg
27880056db break set/get vnn map out from ctdb_control and put it in ctdb_recover.c
for the time being

remove all the [de]marshalling and just pass a structure around instead

(This used to be ctdb commit b1169555ab7015976c0135ff51121cc238f5887c)
2007-05-03 11:06:24 +10:00
Ronnie Sahlberg
768eb0f763 merge from tridge
(This used to be ctdb commit 17b73a811009588f836c3f9fd1b775d9d504d30c)
2007-05-02 22:00:48 +10:00
Ronnie Sahlberg
206fb1fd3b add a recover test change alignment for the pull/push db structures
(This used to be ctdb commit 0eb45623ca103e69765ed577ae02e7f8ca777e37)
2007-05-02 21:00:02 +10:00
Andrew Tridgell
317ad52758 added a builtin fetch function to support samba3 unlocked fetch
(This used to be ctdb commit 8c57a8355a94a7d714b9bec98533bc40a2bc4684)
2007-05-02 15:11:11 +10:00
Andrew Tridgell
2767ebb8df nicer command parsing in ctdb_control
(This used to be ctdb commit 440077ffabb4ce831004b36ac26bd2f8f9b41499)
2007-05-02 13:34:55 +10:00
Andrew Tridgell
353762fd18 nicer string handling in usage
(This used to be ctdb commit 8c568ada9b46132ebfa452def4f8ba3f11240532)
2007-05-02 13:29:03 +10:00
Ronnie Sahlberg
d20990c2b6 add a control to create a database
(This used to be ctdb commit 74e489c6737cc79537c7812ea82daafb1b363ec2)
2007-05-02 12:43:35 +10:00
Ronnie Sahlberg
9d20a09631 change the getnodemap control to a more consistent output for whether a
node is connected or not

(This used to be ctdb commit 65c5fe53937a17e1fa6de5739cbd01b982dc49bb)
2007-05-02 11:06:58 +10:00
Ronnie Sahlberg
3a891c6676 merge with tridge's tree to resolve all conflicts
(This used to be ctdb commit 0f7c6c580ef0de60af68fd22bce36c0c0b2515b0)
2007-05-02 10:53:29 +10:00
Ronnie Sahlberg
92f5daf252 specify which node to perform recovery to when using the recovery
control

(This used to be ctdb commit c67f8a1783ce6f5af9940d2e22847ddcd939763d)
2007-05-02 10:37:43 +10:00
Ronnie Sahlberg
51630f9b12 add an initial recovery control to perform samba3 style recovery
this is not optimized at all and copies/merges all records between 
databases instead of only those records for which a certain node is 
lmaster.  (step 7 should later be enhanced to a, delete the database, 
push only those records for which the node is lmaster)

(This used to be ctdb commit 509d2c71169e96a8610f9db91293dc7a73c2cc10)
2007-05-02 10:20:34 +10:00
Andrew Tridgell
2dc24c7d56 added a hopcount in ctdb_call
(This used to be ctdb commit 36d838801a2a2008c50322cdbfff65a308b1cd1a)
2007-05-01 13:25:02 +10:00
Andrew Tridgell
bbf358cfcf added attach command in ctdb_control
(This used to be ctdb commit 172ee33306be2ef5ce17a5b9d7fbcc1f265a1b0b)
2007-04-30 15:54:06 +02:00
Ronnie Sahlberg
eacfcaf437 add push/pull of tdb and a control to copy a tdb from one node to
another node

(This used to be ctdb commit c313daff4c1362cd08a9f682ce04cab312678038)
2007-04-30 00:58:27 +10:00
Ronnie Sahlberg
f67a79ad8e merge from tridge
(This used to be ctdb commit a84e9b47a87fc7d4756b4a179aa2ea0bc7c54c78)
2007-04-29 23:49:27 +10:00
Ronnie Sahlberg
77ce5750b2 add a new "recovery mode" field to ctdb.
while recovery is in progress, the daemon will discard all CTDB_REQ_CALL
and rely on clients retransmitting them

add new controls to get/set the recovery mode

(This used to be ctdb commit 41458a61577885ac49150f830e92e93e634c5411)
2007-04-29 22:51:56 +10:00
Ronnie Sahlberg
1af701291f implement a control to pull a database from a remote node
it does not yet work since ctdb_control can right now only be called 
from client context and the pull is implemented as the target ctdb node 
itself using a get_keys to pull the keys from the source node, thus
ctdb daemon needs to ctdb_control to a remote node

(This used to be ctdb commit a55c7c64b4ff87f54b90649c9f469b1ff36dc9ea)
2007-04-29 22:14:51 +10:00
Ronnie Sahlberg
376a3ea852 control to delete all records in a database
(This used to be ctdb commit 6664e00fc02e1c60cc1a35ecd15f4893a34f23d1)
2007-04-29 18:48:46 +10:00
Ronnie Sahlberg
c0b0b4a0f5 add a new control to set all records in a database to a new dmaster
(This used to be ctdb commit fd0d2385206b0329b74d908f3bdf89d3f32095d1)
2007-04-29 18:34:11 +10:00
Ronnie Sahlberg
097037a055 add a control to read an entire tdb from a node including
key/lmaster/header and data

(This used to be ctdb commit ac00d6271ba6356c1edf804df44d0d2600791610)
2007-04-29 05:47:13 +10:00
Andrew Tridgell
10910f52eb added reset status control
(This used to be ctdb commit ec342b667a085a5c740fbeec8882070571071862)
2007-04-28 19:13:36 +02:00
Andrew Tridgell
6e09bfdaf9 much simpler redirect logic
(This used to be ctdb commit 95db9afa7dd039e1700e2f3078782f6ac66e9f51)
2007-04-28 18:18:33 +02:00
Andrew Tridgell
1e538be42d better name for this hack
(This used to be ctdb commit e5a98eee991a7926ddb6964ea3785b11303d175e)
2007-04-28 17:46:37 +02:00
Andrew Tridgell
c885b159f4 use ctdb_get_connected_nodes for node listing
(This used to be ctdb commit b4efdd1944207e51dccd6cd5e50f451a7dddcd91)
2007-04-28 17:42:40 +02:00
Andrew Tridgell
4b6d00974d added status all and debug all control operations
(This used to be ctdb commit 7f902f6c4270adc0543096c78415d335b88d6232)
2007-04-28 17:13:30 +02:00
Andrew Tridgell
e6d5848a20 report number of clients in ping
(This used to be ctdb commit 9deaa1892faa8288cad9f6fde20d2aa8ba8af699)
2007-04-28 15:15:21 +02:00
Ronnie Sahlberg
d81b306b93 merge with tridge
fix the logic in ctdb_connected to print CONNECTED if the node is 
connected and UNAVAILABLE when the node is dead, instead of the opposite

(This used to be ctdb commit 0f431d2f3e0bd94d10fe77e56cf0ed6c48402400)
2007-04-28 23:11:23 +10:00
Ronnie Sahlberg
4381fb07c1 print vnn as decimal instead of hex
(This used to be ctdb commit 89512fd659c5b1dc450b7162ca985a7083fd40ac)
2007-04-28 20:42:42 +10:00
Ronnie Sahlberg
acb4bc095b add a few more controls that are useful for debugging a cluster
(This used to be ctdb commit 751c1365ab55a217ff33d985d52bd26581578617)
2007-04-28 20:40:26 +10:00
Ronnie Sahlberg
643bfe83d3 add a control to pull the database list from a remote node
(This used to be ctdb commit d130e02936ea4bdcd3a6f02c53be4b7771993138)
2007-04-28 20:00:50 +10:00
Andrew Tridgell
22546add19 debug level controls
(This used to be ctdb commit 85f883c081dd1ab069420d2e7f4f2e9d708b3cde)
2007-04-27 15:14:36 +02:00
Ronnie Sahlberg
d4c54a93a0 add a new control: SETVNNMAP to set the generation id and also the vnn
map on a ctdbd daemon

(This used to be ctdb commit f55707885f7b233ad6ddfc952d08851577063200)
2007-04-27 22:08:12 +10:00
Ronnie Sahlberg
d9edf88ae5 add a control to read the vnnmap configuration from a node
add support in ctdb_control to fetch this information from a node

(This used to be ctdb commit 8d7f26c8d78d30c3ccb15a28ddea940d8666e052)
2007-04-27 20:56:10 +10:00
Andrew Tridgell
afa0876335 added a ctdb_get_config call
added a ctdb ping control

(This used to be ctdb commit 7d17378b6e6076a922cffe98239e20dfbbae3bf7)
2007-04-26 19:27:07 +02:00
Andrew Tridgell
8ae14b4052 moved status to ctdb_control
(This used to be ctdb commit 9a543968ba0379fbf8e977e184f22f4349d6243f)
2007-04-26 14:51:41 +02:00
Andrew Tridgell
d955485e7b added a ctdb control message, and tool
(This used to be ctdb commit 0d7a71f35bb8ce95231f8ca1e8e3e4024fe657e5)
2007-04-26 14:27:49 +02:00
Andrew Tridgell
3964d36c91 add version printout
(This used to be ctdb commit 54aaf64cf0681329cdcdc4b7f76e1335952bb683)
2007-04-24 15:17:50 +02:00
Andrew Tridgell
0ba189d423 fit some more windows across a screen
(This used to be ctdb commit f787f9c966bab82065b979b0a6fcc5c056c7cee4)
2007-04-24 14:24:34 +02:00
Andrew Tridgell
f651581460 added max_redirect_count status field
(This used to be ctdb commit ecea04741fe552aa409ab165d7c69ead9649986c)
2007-04-22 18:57:22 +02:00
Andrew Tridgell
1349f0bd49 mark authoritative records
(This used to be ctdb commit f2076338221c5cb28f9045ce5345cc6a9b429f1a)
2007-04-22 16:53:09 +02:00
Andrew Tridgell
f9bfd8a081 debug changes
(This used to be ctdb commit 3ddc1e4f1d3660d33cc2a07e53b66772116e9640)
2007-04-22 16:39:55 +02:00
Andrew Tridgell
107d91e391 - when handling a record migration in the lmaster, bypass the usual
  dmaster request stage, and instead directly send a dmaster
  reply. This avoids a race condition where a new call comes in for
  the same record while processing the dmaster request

- don't keep any redirect records during a ctdb call.  This prevents a
  memory leak in case of a redirect storm

(This used to be ctdb commit 59889ca0fd606c7d2156839383a09dfc5a2e4853)
2007-04-22 14:26:45 +02:00
Andrew Tridgell
2a08818e24 added a useful tool for dumping a ctdb
(This used to be ctdb commit 671ed94011e21396571a3f4a5191b9a83911c952)
2007-04-22 09:24:27 +02:00
Andrew Tridgell
e9d43f5e43 - expanded status to include count of each call type
- added lockwait latency

(This used to be ctdb commit 0b5d196147e644cf8b172cb4b593fd46b1caa386)
2007-04-20 21:02:53 +10:00
Andrew Tridgell
2e5aae04de added ctdb_status tool
(This used to be ctdb commit 908d6c6a936e21f70f05827ce302e966cca0132a)
2007-04-20 20:07:47 +10:00