1
0
mirror of https://github.com/samba-team/samba.git synced 2025-01-11 05:18:09 +03:00
Commit Graph

922 Commits

Author SHA1 Message Date
Andrew Tridgell
f8765b19bf - got rid of the complex hand marshalling in the recovery controls
- fixed the re-send of ctdb calls after a generation change

- fixed a reqid idr leak in controls

- removed the write_record test code

- use the new nonblock lockall code to prevent ctdbd from ever doing a
  blocking lock that could deadlock with smbd

- moved more of the recovery controls into ctdb_recover.c

(This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec)
2007-05-10 17:43:45 +10:00
Andrew Tridgell
15bc97cdaa better timeout handling for calls, controls and traverses
(This used to be ctdb commit 63346a6c59d4821b4c443939b5d88db8cd20f5fe)
2007-05-10 14:06:48 +10:00
Andrew Tridgell
31cd92dc7e merge from ronnie
(This used to be ctdb commit 92b7a849565730744c75a7fb776173554e9f57bf)
2007-05-10 13:15:58 +10:00
Ronnie Sahlberg
82e37a9886 update ctdb_control to create a correct ctdb_vnn_map->map array
(This used to be ctdb commit e510cc89068557881688d6cada38915b3e51f8cd)
2007-05-10 10:03:21 +10:00
Andrew Tridgell
1e38ae491f remove old s3 recovery code
fixed vnnmap wire format in recover daemon

(This used to be ctdb commit e03fab7bfe0cf43f40c49a3d63e75dc44001d8d8)
2007-05-10 08:49:57 +10:00
Ronnie Sahlberg
2befe18e29 add a small tool to monitor recovery
(This used to be ctdb commit b45936828713c31ee670e2106b49c2351234f310)
2007-05-09 08:05:53 +10:00
Ronnie Sahlberg
39d81cffb1 recovery daemon with recovery master election
election is primitive, it elects the lowest vnn as the recovery master

two new controls, to get/set recovery master for a node



to use recovery daemon,   start one  
./bin/recoverd --socket=ctdb.socket*
for each ctdb daemon


it has been briefly tested by deleting and adding nodes to a 4 node 
cluster but needs more testing

(This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3)
2007-05-07 06:51:58 +10:00
Ronnie Sahlberg
7bbcc964f2 add support in catdb to dump the content of a specific nodes tdb instead
of traversing the full cluster.
this makes it easier to debug recovery

update the test script for recovery to reflect the newish signatures to
ctdb_control



the catdb control does still segfault however when there are missing 
nodes in the cluster   as there are toward the end of the recovery test

(This used to be ctdb commit 8de2a97c14a444f817ceb36461314f10c9601ecc)
2007-05-06 05:53:15 +10:00
Ronnie Sahlberg
25edbc9a50 add a control to get the pid of a daemon.
this makes it possible to kill a specific daemon in the recover test 
script

(This used to be ctdb commit 2fa394b4c80988cb1a6d04b236ec64cc9d9e8a40)
2007-05-06 04:31:22 +10:00
Ronnie Sahlberg
2e64727079 merge from tridge
(This used to be ctdb commit 8648104f8d76d22427c14422b126f7e979cc2d95)
2007-05-05 16:51:34 +10:00
Andrew Tridgell
9636c97c5a show number of connected clients in status output
(This used to be ctdb commit 99765bbe327bfe9c43415f4943281458f25be51b)
2007-05-05 14:09:46 +10:00
Ronnie Sahlberg
5cb817f031 split the vnn broadcast address into two
one broadcast address for all nodes
and one broadcast address for all nodes in the current vnnmap

update all useage of the old flag to now only broadcast to the vnnmap
except for tools/ctdb_control where it makes more sense to broadcast to 
all nodes

(This used to be ctdb commit dfb65b88cf67ad9d61268c4b47a6d8ae346f47df)
2007-05-05 13:17:26 +10:00
Andrew Tridgell
410d41480a added a dumpmemory control, used to find memory leaks
(This used to be ctdb commit 44fdafaf421e3e906796d529aed2f7c5df201b94)
2007-05-05 11:03:10 +10:00
Andrew Tridgell
adc64aed0a - fixed a crash bug after client disconnect in ctdb_control
- added total memory used to ctdb_control status output

(This used to be ctdb commit a99ffe4372edc63d83d8c8ebf9a60b3413301f5a)
2007-05-05 08:33:35 +10:00
Andrew Tridgell
d8f4e6b209 - added counters for controls in ctdb_control status
(This used to be ctdb commit 858061372fc9902837a1a5b8bcfc0ada58eec193)
2007-05-05 08:11:54 +10:00
Ronnie Sahlberg
508cafd17e merge from tridge
(This used to be ctdb commit 6c8b90cedc67daa89d54db5268fde18bfc20abaf)
2007-05-04 17:05:28 +10:00
Ronnie Sahlberg
7dfdab1b9d recovery daemon
this program is a client to the local ctdb daemon

every second it pulls all vnnmap and nodemaps from all nodes that are 
available and checks if a recovery is required

a recovery is required if :
* all nodes do NOT have an identical vnnmap and generation
* all nodes do NOT have an identical nodemap
* there are active nodes that are NOT in the nodemap
* there are nodes in the nodemap that are NOT active

During recovery,  the recovery tool will also make sure that all nodes 
know about and have created all databases.

(This used to be ctdb commit 2f2650467bac7e8954de7c17cb34f46b0bdbcd26)
2007-05-04 15:21:40 +10:00
Andrew Tridgell
5c4a477120 make catdb take a dbname instead of an id
(This used to be ctdb commit 365346345c33d2f310bb23d0c6ab5c3ed5e6e938)
2007-05-04 13:25:30 +10:00
Andrew Tridgell
f2fd53056d nicer interface to ctdb traverse
(This used to be ctdb commit e5ce866dcc5037b5069e42bf1e168b646f007b01)
2007-05-04 12:18:39 +10:00
Andrew Tridgell
e752f3bd97 - changed the REQ_REGISTER PDU to be a control
- allow controls to know which client invoked them

- added a client_id to clients, so they can be identified remotely

- added the ability to remove registered srvids

- in the list_keys code, register a temp srvid, then remove it afterwards

(This used to be ctdb commit 29603c51cc6d81362532cd8e50f75c8360c5f5ef)
2007-05-04 11:41:29 +10:00
Ronnie Sahlberg
2b1714a521 update getvnnmap control to take a timeout parameter
dont explicitely free the vnnmap pointer in the getvnnmap control  this 
is freed by the mem_ctx instead

add code to the recoverd to detect when/if recovery is required
veiry that the number of active nodes, the nodemap and the vnn map is 
consistent across the entire cluster and if not   trigger a recovery 
(which right now just prints "we need to do recovery" to the screen.

(This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5)
2007-05-04 09:45:53 +10:00
Ronnie Sahlberg
ae73784c28 change the signature for ctdb_ctrl_getnodemap() so that a timeout
parameter is added.
change ctdb_get_connected_nodes in the same way

(This used to be ctdb commit d85f23bcf4c1230225abb2f4a053c70b68d469aa)
2007-05-04 09:01:01 +10:00
Ronnie Sahlberg
ebc478749b start working on a recovery daemon
change ctdb_control so it takes a timeval pointer as argument.
this is the timeout. if the node has not responded within hte timeout
ctdb_control will return an error instead of hanging.
if the timeval pointer is NULL then the call will block indefinitely if 
there is no response.

this is used for now in the createdb control   but all the helpers 
ctdb_ctrl_* should probably be updated to take a timeout parameter as 
well.

(This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211)
2007-05-04 08:30:18 +10:00
Andrew Tridgell
486c6b4fce merged from ronnie
(This used to be ctdb commit 57a80110ddfd202f8de37297db76dc43a064e476)
2007-05-03 13:53:54 +10:00
Ronnie Sahlberg
d88154b24a cleanup getnodemap
(This used to be ctdb commit 3867ccf71a167fb82dbc5a3f03f968a325a0c70b)
2007-05-03 13:30:38 +10:00
Ronnie Sahlberg
633ae7f346 fixup getdbmap control so it looks a bit nicer
(This used to be ctdb commit 78a4d61cb78da20af5210488e685c91bc3023e90)
2007-05-03 13:07:34 +10:00
Andrew Tridgell
472b96d6d3 first stage of efficient non-blocking ctdb traverse
(This used to be ctdb commit 4c23e6f26bde421bb56b55de9d6cd3e319b2be40)
2007-05-03 12:16:03 +10:00
Ronnie Sahlberg
27880056db break set/get vnn map out from ctdb_control and put it in ctdb_recover.c
for the time being

remove all the [de]marshalling and just pass a structure around instead

(This used to be ctdb commit b1169555ab7015976c0135ff51121cc238f5887c)
2007-05-03 11:06:24 +10:00
Ronnie Sahlberg
768eb0f763 merge from tridge
(This used to be ctdb commit 17b73a811009588f836c3f9fd1b775d9d504d30c)
2007-05-02 22:00:48 +10:00
Ronnie Sahlberg
206fb1fd3b add a recover test change alignment for the pull/push db structures
(This used to be ctdb commit 0eb45623ca103e69765ed577ae02e7f8ca777e37)
2007-05-02 21:00:02 +10:00
Andrew Tridgell
317ad52758 added a builtin fetch function to support samba3 unlocked fetch
(This used to be ctdb commit 8c57a8355a94a7d714b9bec98533bc40a2bc4684)
2007-05-02 15:11:11 +10:00
Andrew Tridgell
2767ebb8df nicer command parsing in ctdb_control
(This used to be ctdb commit 440077ffabb4ce831004b36ac26bd2f8f9b41499)
2007-05-02 13:34:55 +10:00
Andrew Tridgell
353762fd18 nicer string handling in usage
(This used to be ctdb commit 8c568ada9b46132ebfa452def4f8ba3f11240532)
2007-05-02 13:29:03 +10:00
Ronnie Sahlberg
d20990c2b6 add a control to create a database
(This used to be ctdb commit 74e489c6737cc79537c7812ea82daafb1b363ec2)
2007-05-02 12:43:35 +10:00
Ronnie Sahlberg
9d20a09631 change the getnodemap control to a more consistent output for whether a
node is connected or not

(This used to be ctdb commit 65c5fe53937a17e1fa6de5739cbd01b982dc49bb)
2007-05-02 11:06:58 +10:00
Ronnie Sahlberg
3a891c6676 merge with tridges tree to resolve all conflicts
(This used to be ctdb commit 0f7c6c580ef0de60af68fd22bce36c0c0b2515b0)
2007-05-02 10:53:29 +10:00
Ronnie Sahlberg
92f5daf252 specify which node to perform recovery to when using the recovery
control

(This used to be ctdb commit c67f8a1783ce6f5af9940d2e22847ddcd939763d)
2007-05-02 10:37:43 +10:00
Ronnie Sahlberg
51630f9b12 add an initial recovery control to perform samba3 style recovery
this is not optimized at all and copies/merges all records between 
databases instead of only those records for which a certain node is 
lmaster.  (step 7 should later be enhanced to a, delete the database, 
push only those records for which the node is lmaster)

(This used to be ctdb commit 509d2c71169e96a8610f9db91293dc7a73c2cc10)
2007-05-02 10:20:34 +10:00
Andrew Tridgell
2dc24c7d56 added a hopcount in ctdb_call
(This used to be ctdb commit 36d838801a2a2008c50322cdbfff65a308b1cd1a)
2007-05-01 13:25:02 +10:00
Andrew Tridgell
bbf358cfcf added attach command in ctdb_control
(This used to be ctdb commit 172ee33306be2ef5ce17a5b9d7fbcc1f265a1b0b)
2007-04-30 15:54:06 +02:00
Ronnie Sahlberg
eacfcaf437 add push/pull of tdb and a control to copy a tdb from one node to
another node

(This used to be ctdb commit c313daff4c1362cd08a9f682ce04cab312678038)
2007-04-30 00:58:27 +10:00
Ronnie Sahlberg
f67a79ad8e merge from tridge
(This used to be ctdb commit a84e9b47a87fc7d4756b4a179aa2ea0bc7c54c78)
2007-04-29 23:49:27 +10:00
Ronnie Sahlberg
77ce5750b2 add a new "recovery mode" field to ctdb.
while recovery is in progress  the daemon will discard all CTDB_REQ_CALL 
and rely on clients retransmitting them

add new controls to get/set the recovery mode

(This used to be ctdb commit 41458a61577885ac49150f830e92e93e634c5411)
2007-04-29 22:51:56 +10:00
Ronnie Sahlberg
1af701291f implement a control to pull a database from a remote node
it does not yet work since ctdb_control can right now only be called 
from client context and the pull is implemented as the target ctdb node 
itself using a get_keys to pull the keys from the source node   thus 
ctdb daemon needs to ctdb_control to a remote node

(This used to be ctdb commit a55c7c64b4ff87f54b90649c9f469b1ff36dc9ea)
2007-04-29 22:14:51 +10:00
Ronnie Sahlberg
376a3ea852 control to delete all records in a database
(This used to be ctdb commit 6664e00fc02e1c60cc1a35ecd15f4893a34f23d1)
2007-04-29 18:48:46 +10:00
Ronnie Sahlberg
c0b0b4a0f5 add a new control to set all records in a database to a new dmaster
(This used to be ctdb commit fd0d2385206b0329b74d908f3bdf89d3f32095d1)
2007-04-29 18:34:11 +10:00
Ronnie Sahlberg
097037a055 add a control to read an entire tdb from a node including
key/lmaster/header and data

(This used to be ctdb commit ac00d6271ba6356c1edf804df44d0d2600791610)
2007-04-29 05:47:13 +10:00
Andrew Tridgell
10910f52eb added reset status control
(This used to be ctdb commit ec342b667a085a5c740fbeec8882070571071862)
2007-04-28 19:13:36 +02:00
Andrew Tridgell
6e09bfdaf9 much simpler redirect logic
(This used to be ctdb commit 95db9afa7dd039e1700e2f3078782f6ac66e9f51)
2007-04-28 18:18:33 +02:00
Andrew Tridgell
1e538be42d better name for this hack
(This used to be ctdb commit e5a98eee991a7926ddb6964ea3785b11303d175e)
2007-04-28 17:46:37 +02:00
Andrew Tridgell
c885b159f4 use ctdb_get_connected_nodes for node listing
(This used to be ctdb commit b4efdd1944207e51dccd6cd5e50f451a7dddcd91)
2007-04-28 17:42:40 +02:00
Andrew Tridgell
4b6d00974d added status all and debug all control operations
(This used to be ctdb commit 7f902f6c4270adc0543096c78415d335b88d6232)
2007-04-28 17:13:30 +02:00
Andrew Tridgell
e6d5848a20 report number of clients in ping
(This used to be ctdb commit 9deaa1892faa8288cad9f6fde20d2aa8ba8af699)
2007-04-28 15:15:21 +02:00
Ronnie Sahlberg
d81b306b93 merge with tridge
fix the logic in ctdb_connected to print CONNECTED if the node is 
connected and UNAVAILABLE when the node is dead  instead of the opposite

(This used to be ctdb commit 0f431d2f3e0bd94d10fe77e56cf0ed6c48402400)
2007-04-28 23:11:23 +10:00
Ronnie Sahlberg
4381fb07c1 print vnn as decimal instead of hex
(This used to be ctdb commit 89512fd659c5b1dc450b7162ca985a7083fd40ac)
2007-04-28 20:42:42 +10:00
Ronnie Sahlberg
acb4bc095b add a few more controls that are useful for debugging a cluster
(This used to be ctdb commit 751c1365ab55a217ff33d985d52bd26581578617)
2007-04-28 20:40:26 +10:00
Ronnie Sahlberg
643bfe83d3 add a control to pull the database list from a remote node
(This used to be ctdb commit d130e02936ea4bdcd3a6f02c53be4b7771993138)
2007-04-28 20:00:50 +10:00
Andrew Tridgell
22546add19 debug level controls
(This used to be ctdb commit 85f883c081dd1ab069420d2e7f4f2e9d708b3cde)
2007-04-27 15:14:36 +02:00
Ronnie Sahlberg
d4c54a93a0 add a new control : SETVNNMAP to set the generation id and also the vnn
map on a ctdbd daemon

(This used to be ctdb commit f55707885f7b233ad6ddfc952d08851577063200)
2007-04-27 22:08:12 +10:00
Ronnie Sahlberg
d9edf88ae5 add a control to read the vnnmap configuration from a node
add support in ctdb_control to fetch this information from a node

(This used to be ctdb commit 8d7f26c8d78d30c3ccb15a28ddea940d8666e052)
2007-04-27 20:56:10 +10:00
Andrew Tridgell
afa0876335 added a ctdb_get_config call
added a ctdb ping control

(This used to be ctdb commit 7d17378b6e6076a922cffe98239e20dfbbae3bf7)
2007-04-26 19:27:07 +02:00
Andrew Tridgell
8ae14b4052 moved status to ctdb_control
(This used to be ctdb commit 9a543968ba0379fbf8e977e184f22f4349d6243f)
2007-04-26 14:51:41 +02:00
Andrew Tridgell
d955485e7b added a ctdb control message, and tool
(This used to be ctdb commit 0d7a71f35bb8ce95231f8ca1e8e3e4024fe657e5)
2007-04-26 14:27:49 +02:00
Andrew Tridgell
3964d36c91 add version printout
(This used to be ctdb commit 54aaf64cf0681329cdcdc4b7f76e1335952bb683)
2007-04-24 15:17:50 +02:00
Andrew Tridgell
0ba189d423 fit some more windows across a screen
(This used to be ctdb commit f787f9c966bab82065b979b0a6fcc5c056c7cee4)
2007-04-24 14:24:34 +02:00
Andrew Tridgell
f651581460 added max_redirect_count status field
(This used to be ctdb commit ecea04741fe552aa409ab165d7c69ead9649986c)
2007-04-22 18:57:22 +02:00
Andrew Tridgell
1349f0bd49 mark authoritative records
(This used to be ctdb commit f2076338221c5cb28f9045ce5345cc6a9b429f1a)
2007-04-22 16:53:09 +02:00
Andrew Tridgell
f9bfd8a081 debug changes
(This used to be ctdb commit 3ddc1e4f1d3660d33cc2a07e53b66772116e9640)
2007-04-22 16:39:55 +02:00
Andrew Tridgell
107d91e391 - when handling a record migration in the lmaster, bypass the usual
dmaster request stage, and instead directly send a dmaster
  reply. This avoids a race condition where a new call comes in for
  the same record while processing the dmaster request

- don't keep any redirect records during a ctdb call.  This prevents a
  memory leak in case of a redirect storm

(This used to be ctdb commit 59889ca0fd606c7d2156839383a09dfc5a2e4853)
2007-04-22 14:26:45 +02:00
Andrew Tridgell
2a08818e24 added a useful tool for dumping a ctdb
(This used to be ctdb commit 671ed94011e21396571a3f4a5191b9a83911c952)
2007-04-22 09:24:27 +02:00
Andrew Tridgell
e9d43f5e43 - expanded status to include count of each call type
- added lockwait latency

(This used to be ctdb commit 0b5d196147e644cf8b172cb4b593fd46b1caa386)
2007-04-20 21:02:53 +10:00
Andrew Tridgell
2e5aae04de added ctdb_status tool
(This used to be ctdb commit 908d6c6a936e21f70f05827ce302e966cca0132a)
2007-04-20 20:07:47 +10:00