1
0
mirror of https://github.com/samba-team/samba.git synced 2025-01-13 13:18:06 +03:00
Commit Graph

266 Commits

Author SHA1 Message Date
Andrew Tridgell
cb81a2eca8 watch for the freeze child exiting
(This used to be ctdb commit 7f350eca8598022ebd198b2476d1f2c2a8f03a8d)
2007-05-12 15:44:35 +10:00
Andrew Tridgell
f7e3004f0a more robust freeze/thaw logic
(This used to be ctdb commit 51c1e51aeb7dfac1683584df7ef1bef98c092f76)
2007-05-12 15:29:06 +10:00
Andrew Tridgell
9cf77dd23f separate out the freeze/thaw handling from recovery
(This used to be ctdb commit 0b0640bd8b8334961f240e0cf276ac112cd6e616)
2007-05-12 15:15:27 +10:00
Andrew Tridgell
74a799a83b added lockwait child code for entering recovery mode. A child processes holds lockall locks for the entire recovery process
(This used to be ctdb commit f892f30def75b0d964c35eae38c4cf675597dd28)
2007-05-12 14:34:21 +10:00
Andrew Tridgell
85aff64ed8 fixed debug message
(This used to be ctdb commit 9802bf1ef9104b31977020e803b0f81da71c7169)
2007-05-11 17:29:21 +10:00
Andrew Tridgell
63acf8ab95 - merge from ronnie
- increment rsn only in become_dmaster
- add torture check for rsn regression in ctdb_ltdb_store

(This used to be ctdb commit 8047506a08bb53ee01aa64f25c9f72839e1e2d68)
2007-05-11 10:33:43 +10:00
Ronnie Sahlberg
9eeb4f1a51 we must bump the rsn everytime we do a REQ_DMASTER or a REPLY_DMASTER
to make sure that the "merge records based on rsn during recovery" will
merge correctly.

this is extra important since samba3 never bumps the record when it 
writes new data to it !

(This used to be ctdb commit 857e67204065603592c2dbbadbd8667ebba9ccdb)
2007-05-11 06:08:17 +10:00
Ronnie Sahlberg
325713dfeb make ctdb_control catdb work again
(This used to be ctdb commit 40a8fb68c71be0b9f54ae88bf8aa39a4c71f3b5a)
2007-05-11 05:40:11 +10:00
Andrew Tridgell
f8765b19bf - got rid of the complex hand marshalling in the recovery controls
- fixed the re-send of ctdb calls after a generation change

- fixed a reqid idr leak in controls

- removed the write_record test code

- use the new nonblock lockall code to prevent ctdbd from ever doing a
  blocking lock that could deadlock with smbd

- moved more of the recovery controls into ctdb_recover.c

(This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec)
2007-05-10 17:43:45 +10:00
Andrew Tridgell
15bc97cdaa better timeout handling for calls, controls and traverses
(This used to be ctdb commit 63346a6c59d4821b4c443939b5d88db8cd20f5fe)
2007-05-10 14:06:48 +10:00
Andrew Tridgell
2a82665532 fixed setvnnmap to use wire structures too
(This used to be ctdb commit 1208e4219d220b80e2f74974cac8ed2b8956d3ef)
2007-05-10 08:22:26 +10:00
Andrew Tridgell
682df74d59 separate the wire format and internal format for the vnn_map
(This used to be ctdb commit 9a71718d87c5162f1423d85c2e86a01f6771925e)
2007-05-10 08:13:19 +10:00
Andrew Tridgell
a8f83423f4 moved the vnn_map initialisation out of the cmdline code
(This used to be ctdb commit 81492b840d608dc724d5a25ddef6eb0ce12b95fb)
2007-05-10 07:55:46 +10:00
Andrew Tridgell
ba47b43c6b merged ronnies code to delay client requests when in recovery mode
(This used to be ctdb commit dfca37076d642f3407c63dfe3b685287d27c8f8d)
2007-05-10 07:43:18 +10:00
Ronnie Sahlberg
bbaaf2bbf4 hang the event from the retry structure instead of the hdr structure
(This used to be ctdb commit 8536c8c3a30a986ba4945d02aef82b47495ce3f8)
2007-05-09 14:08:11 +10:00
Ronnie Sahlberg
c938c1b5de when we are in recovery mode and we get a REQ_CALL from a client,
defer it for one second and try again   

(This used to be ctdb commit 606fb6414b97d1813056982cda7c0fe84d746e67)
2007-05-09 14:06:47 +10:00
Andrew Tridgell
d2a90cc5a5 merge from ronnie
(This used to be ctdb commit f67a4842e7b1efb2ad61c41e4895c7698e564bf3)
2007-05-09 11:54:37 +10:00
Ronnie Sahlberg
6929739b7f add a command line flag to ctdbd to start a recovery daemon.
update the recovery test script to start all ctdb daemons with a 
recovery daemon

(This used to be ctdb commit 47794e16df285cacefc30208d892d931a6e46b96)
2007-05-09 09:59:23 +10:00
Andrew Tridgell
fdb8144e62 fixed a problem with the number of timed events growing without bound with the new seqnum code
(This used to be ctdb commit 6109ae3dae8d93c93a2dc76cc561ea6e21458aa6)
2007-05-08 21:16:29 +10:00
Ronnie Sahlberg
e11eebd070 fix alignment bug for pulldb
(This used to be ctdb commit f1188289c18805c2c5f8bae61d73df3fc762faee)
2007-05-08 14:42:00 +10:00
Ronnie Sahlberg
a1866c6eeb hang the timeout event off state and thus we dont need to explicitely
free it   and also we wont accidentally return from the function without 
killing the event first

(This used to be ctdb commit e3d72d024ef7342a808e5c488fd646a39e5fac78)
2007-05-07 07:54:17 +10:00
Ronnie Sahlberg
6bfb5f61ca it now works to talloc_free() the timed event if we no longer want it to
trigger

this must have been a sideeffect of a different bug in the recoverd.c 
code that has now been fixed

(This used to be ctdb commit 676446fd1083c371ad0ff72dd8c636ec8e6d1423)
2007-05-07 07:47:16 +10:00
Ronnie Sahlberg
a9657f6aa5 add new controls to get and set the recovery master node of a daemon
i.e. which node is "elected" to check for and drive recovery

(This used to be ctdb commit d577093eb4b619392c71ab5ce81e8c02565d93f0)
2007-05-07 05:02:48 +10:00
Ronnie Sahlberg
c9aafae5ce dont allocate arrays where we can just return a single integer
(This used to be ctdb commit 07bc338e490e0f7018808a2450bc54863eb88c94)
2007-05-06 08:05:22 +10:00
Ronnie Sahlberg
dceab7ff3e dont use arrays where a uint32_t works just as well
(This used to be ctdb commit 843e974b29c93df891ae7cf13323ee960a334f60)
2007-05-06 07:52:20 +10:00
Ronnie Sahlberg
ad41dff7bf add a ifdeffed out block to the call.
we really should kill the event in case the call completed before the 
timeout   so that we can also make timed_out non-static

(This used to be ctdb commit f297eed589b1d4e188f77f195683365cf91d0e62)
2007-05-06 07:32:16 +10:00
Ronnie Sahlberg
4f2cdc2d8b hte timed_out variable needs to be static and can not be on the stack
since if the command times out and we return from ctdb_control   we may 
have events that can trigger later which will overwrite data that is no 
longer in our stackframe

(This used to be ctdb commit 93942543092be618c0bd8ef68b470b0789bad7ad)
2007-05-06 07:07:47 +10:00
Ronnie Sahlberg
0f6d9c73d8 merge from tridge
(This used to be ctdb commit 08173e3ab77178b9841db0081a51b93291d9e8dc)
2007-05-06 04:38:41 +10:00
Ronnie Sahlberg
25edbc9a50 add a control to get the pid of a daemon.
this makes it possible to kill a specific daemon in the recover test 
script

(This used to be ctdb commit 2fa394b4c80988cb1a6d04b236ec64cc9d9e8a40)
2007-05-06 04:31:22 +10:00
Andrew Tridgell
7d48810645 merged vnn map broadcast from ronnie
(This used to be ctdb commit c0fa029435fdaa0be006b28eddb6b31beb2ee605)
2007-05-05 17:35:28 +10:00
Andrew Tridgell
542b76136e - take advantage of the new EVENT_FD_AUTOCLOSE flag
- use the tdb_chainlock_mark() call to allow us to guarantee forward progress in the ctdb_lockwait code

(This used to be ctdb commit e201e98aad0fef6a779a80f3b1ae7792953e2d6b)
2007-05-05 17:19:59 +10:00
Andrew Tridgell
cc8ac1ca6b allow the events system to be chosen on the command line
(This used to be ctdb commit 2fe976d7a376a763472cc7952a78b6249ce416c8)
2007-05-05 17:18:43 +10:00
Ronnie Sahlberg
2e64727079 merge from tridge
(This used to be ctdb commit 8648104f8d76d22427c14422b126f7e979cc2d95)
2007-05-05 16:51:34 +10:00
Andrew Tridgell
9636c97c5a show number of connected clients in status output
(This used to be ctdb commit 99765bbe327bfe9c43415f4943281458f25be51b)
2007-05-05 14:09:46 +10:00
Ronnie Sahlberg
5cb817f031 split the vnn broadcast address into two
one broadcast address for all nodes
and one broadcast address for all nodes in the current vnnmap

update all useage of the old flag to now only broadcast to the vnnmap
except for tools/ctdb_control where it makes more sense to broadcast to 
all nodes

(This used to be ctdb commit dfb65b88cf67ad9d61268c4b47a6d8ae346f47df)
2007-05-05 13:17:26 +10:00
Andrew Tridgell
410d41480a added a dumpmemory control, used to find memory leaks
(This used to be ctdb commit 44fdafaf421e3e906796d529aed2f7c5df201b94)
2007-05-05 11:03:10 +10:00
Andrew Tridgell
adc64aed0a - fixed a crash bug after client disconnect in ctdb_control
- added total memory used to ctdb_control status output

(This used to be ctdb commit a99ffe4372edc63d83d8c8ebf9a60b3413301f5a)
2007-05-05 08:33:35 +10:00
Andrew Tridgell
d8f4e6b209 - added counters for controls in ctdb_control status
(This used to be ctdb commit 858061372fc9902837a1a5b8bcfc0ada58eec193)
2007-05-05 08:11:54 +10:00
Ronnie Sahlberg
1725fcf294 merge from tridge
(This used to be ctdb commit 62574808ef4dcb76760f1dd2496fbe8e34197c23)
2007-05-05 01:22:30 +10:00
Andrew Tridgell
fccc585f5a added seqnum propogation code to ctdb
(This used to be ctdb commit be2572b1b09eaaa1ea6a726d60f16996f9407d13)
2007-05-04 22:18:00 +10:00
Ronnie Sahlberg
508cafd17e merge from tridge
(This used to be ctdb commit 6c8b90cedc67daa89d54db5268fde18bfc20abaf)
2007-05-04 17:05:28 +10:00
Andrew Tridgell
ed3e847785 added a ctdb control for enabling the tdb seqnum
(This used to be ctdb commit c66920d9fb08a4a33418e2c1dcf1fc320fba3761)
2007-05-04 15:33:28 +10:00
Ronnie Sahlberg
7dfdab1b9d recovery daemon
this program is a client to the local ctdb daemon

every second it pulls all vnnmap and nodemaps from all nodes that are 
available and checks if a recovery is required

a recovery is required if :
* all nodes do NOT have an identical vnnmap and generation
* all nodes do NOT have an identical nodemap
* there are active nodes that are NOT in the nodemap
* there are nodes in the nodemap that are NOT active

During recovery,  the recovery tool will also make sure that all nodes 
know about and have created all databases.

(This used to be ctdb commit 2f2650467bac7e8954de7c17cb34f46b0bdbcd26)
2007-05-04 15:21:40 +10:00
Andrew Tridgell
f2fd53056d nicer interface to ctdb traverse
(This used to be ctdb commit e5ce866dcc5037b5069e42bf1e168b646f007b01)
2007-05-04 12:18:39 +10:00
Andrew Tridgell
e752f3bd97 - changed the REQ_REGISTER PDU to be a control
- allow controls to know which client invoked them

- added a client_id to clients, so they can be identified remotely

- added the ability to remove registered srvids

- in the list_keys code, register a temp srvid, then remove it afterwards

(This used to be ctdb commit 29603c51cc6d81362532cd8e50f75c8360c5f5ef)
2007-05-04 11:41:29 +10:00
Ronnie Sahlberg
2b1714a521 update getvnnmap control to take a timeout parameter
dont explicitely free the vnnmap pointer in the getvnnmap control  this 
is freed by the mem_ctx instead

add code to the recoverd to detect when/if recovery is required
veiry that the number of active nodes, the nodemap and the vnn map is 
consistent across the entire cluster and if not   trigger a recovery 
(which right now just prints "we need to do recovery" to the screen.

(This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5)
2007-05-04 09:45:53 +10:00
Ronnie Sahlberg
ae73784c28 change the signature for ctdb_ctrl_getnodemap() so that a timeout
parameter is added.
change ctdb_get_connected_nodes in the same way

(This used to be ctdb commit d85f23bcf4c1230225abb2f4a053c70b68d469aa)
2007-05-04 09:01:01 +10:00
Ronnie Sahlberg
be17d4d181 ctdb_control should use the provided timeout and not hardcode to 1.0
seconds

(This used to be ctdb commit 03acb2f450578f6195ab2d0a598f6720e33e7cfb)
2007-05-04 08:32:02 +10:00
Ronnie Sahlberg
ebc478749b start working on a recovery daemon
change ctdb_control so it takes a timeval pointer as argument.
this is the timeout. if the node has not responded within hte timeout
ctdb_control will return an error instead of hanging.
if the timeval pointer is NULL then the call will block indefinitely if 
there is no response.

this is used for now in the createdb control   but all the helpers 
ctdb_ctrl_* should probably be updated to take a timeout parameter as 
well.

(This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211)
2007-05-04 08:30:18 +10:00
Ronnie Sahlberg
63f42d3ff8 merge from tridge
(This used to be ctdb commit fb8ac93c7dfc11e774ef1ce05b0d0df1de56a621)
2007-05-03 17:16:38 +10:00