mirror of https://github.com/samba-team/samba.git synced 2025-01-27 14:04:05 +03:00

767 Commits

Author SHA1 Message Date
Ronnie Sahlberg
5efa3d88c5 we must repoint dmaster to an invalid node during recovery to stop the
shortcut from working

(This used to be ctdb commit 5e18930be8c0efb87aa9e2780d9457634b24e156)
2007-05-08 14:51:55 +10:00
Ronnie Sahlberg
e11eebd070 fix alignment bug for pulldb
(This used to be ctdb commit f1188289c18805c2c5f8bae61d73df3fc762faee)
2007-05-08 14:42:00 +10:00
Ronnie Sahlberg
54d2acec40 merge from tridge
(This used to be ctdb commit da8636707547e77c76dc7e368ddfae35b8a21402)
2007-05-07 08:07:26 +10:00
Andrew Tridgell
d98d8d4de3 merged from ronnie
(This used to be ctdb commit 49aad9fb09ca2c787e6f82ba03cb229cc51844f0)
2007-05-07 07:56:38 +10:00
Ronnie Sahlberg
a1866c6eeb hang the timeout event off state, so we don't need to explicitly
free it, and we won't accidentally return from the function without
killing the event first

(This used to be ctdb commit e3d72d024ef7342a808e5c488fd646a39e5fac78)
2007-05-07 07:54:17 +10:00
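
The idea in the commit above, letting the state own its timeout event so that finishing the call also kills the event, can be sketched in plain C. This is only a minimal sketch and not the ctdb code: the real daemon relies on talloc ownership and the lib/events timed events, whereas struct timer, struct control_state, control_start(), control_finish() and timer_fire() below are hypothetical stand-ins built on calloc/free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical one-slot timer standing in for a real timed event. */
struct timer {
    void (*fn)(void *private_data);
    void *private_data;
    bool armed;
};

/* Hypothetical per-call state. The timed_out flag lives on the heap,
 * so a late callback never writes into a dead stack frame. */
struct control_state {
    bool timed_out;
    struct timer *timer;   /* the timeout that belongs to this call */
};

static void on_timeout(void *private_data)
{
    struct control_state *state = private_data;
    state->timed_out = true;
}

/* Start a call: allocate the state and arm its timeout. */
struct control_state *control_start(struct timer *t)
{
    assert(t != NULL);
    struct control_state *state = calloc(1, sizeof(*state));
    if (state == NULL) {
        return NULL;
    }
    state->timer = t;
    t->fn = on_timeout;
    t->private_data = state;
    t->armed = true;
    return state;
}

/* Finish a call: disarm the timeout before releasing the state, so
 * the callback can never run against freed memory. */
void control_finish(struct control_state *state)
{
    if (state == NULL) {
        return;
    }
    state->timer->armed = false;
    state->timer->fn = NULL;
    state->timer->private_data = NULL;
    free(state);
}

/* Fire the timer if it is still armed; returns true if the callback ran. */
bool timer_fire(struct timer *t)
{
    if (!t->armed) {
        return false;
    }
    t->armed = false;
    t->fn(t->private_data);
    return true;
}
```

Because every exit path goes through control_finish(), there is no separate "remember to kill the event" step to forget.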
Ronnie Sahlberg
6bfb5f61ca talloc_free() on the timed event now works if we no longer want it to
trigger

the earlier failure must have been a side effect of a different bug in the
recoverd.c code that has now been fixed

(This used to be ctdb commit 676446fd1083c371ad0ff72dd8c636ec8e6d1423)
2007-05-07 07:47:16 +10:00
Ronnie Sahlberg
39d81cffb1 recovery daemon with recovery master election
election is primitive: it elects the lowest vnn as the recovery master

two new controls, to get/set the recovery master for a node

to use the recovery daemon, start one
./bin/recoverd --socket=ctdb.socket*
for each ctdb daemon

it has been briefly tested by deleting and adding nodes to a 4-node
cluster but needs more testing

(This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3)
2007-05-07 06:51:58 +10:00
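
The "lowest vnn wins" election described in the commit above can be sketched as below. This is an illustrative sketch under stated assumptions, not the recoverd.c implementation: elect_recmaster(), the active[] array indexed by vnn, and the NO_RECMASTER sentinel are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel returned when no node is active. */
#define NO_RECMASTER UINT32_MAX

/* Return the lowest vnn whose node is active. active[vnn] is assumed
 * to say whether the node with that vnn is currently reachable. */
uint32_t elect_recmaster(const bool *active, uint32_t num_nodes)
{
    assert(active != NULL);
    for (uint32_t vnn = 0; vnn < num_nodes; vnn++) {
        if (active[vnn]) {
            return vnn;
        }
    }
    return NO_RECMASTER;
}
```

Every node that sees the same set of active nodes picks the same master without exchanging any extra messages, which is what makes such a primitive rule workable.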
Ronnie Sahlberg
a9657f6aa5 add new controls to get and set the recovery master node of a daemon
i.e. which node is "elected" to check for and drive recovery

(This used to be ctdb commit d577093eb4b619392c71ab5ce81e8c02565d93f0)
2007-05-07 05:02:48 +10:00
Ronnie Sahlberg
97bc457321 in the function that checks whether the cluster needs recovery, add a
test that all active nodes are in normal mode.
If we discover that some node is still in recovery mode, it may indicate
that a previous recovery ended prematurely, and thus we should start a
new recovery

(This used to be ctdb commit c15517872e6c98c8c425a8d47d2b348ecb0620b0)
2007-05-07 04:41:12 +10:00
Ronnie Sahlberg
1c438a7256 update a comment to be more descriptive
(This used to be ctdb commit 96082c54d830974bf9a4d5bad33ad60379a85798)
2007-05-06 12:46:56 +10:00
Ronnie Sahlberg
1fa2bf831a change a lot of printf into debug statements
(This used to be ctdb commit 6edb9149c7eb36da47e4e6a9dd3ede22263ce3f9)
2007-05-06 10:51:25 +10:00
Ronnie Sahlberg
8a12672992 break out the code to update all nodes to the new vnnmap into a helper
function

(This used to be ctdb commit 81d39177949b54715710907d14ddc888dc09b064)
2007-05-06 10:42:18 +10:00
Ronnie Sahlberg
ee83202da6 create a helper function for recovery to push all local databases out
onto the remote nodes

(This used to be ctdb commit 1ba76d374652cfa29e56fb77c7190349e42d3bcc)
2007-05-06 10:38:44 +10:00
Ronnie Sahlberg
5fb41f4c3b add an extra blank line
(This used to be ctdb commit 75096dde58df6532abbf5b9ebd771e8810156483)
2007-05-06 10:30:18 +10:00
Ronnie Sahlberg
9281cb192c break the code that repoints dmaster for all local and remote records
into a separate helper function

(This used to be ctdb commit d5ab30d0ac21e736eb34eaa19bccfee5f0ce7cfb)
2007-05-06 10:22:13 +10:00
Ronnie Sahlberg
d51a19f2ba create a helper function for recovery that pulls and merges all remote
databases onto the local node

(This used to be ctdb commit 5cecc47449c369f91e83389a94b987ac32b1e3f4)
2007-05-06 10:16:48 +10:00
Ronnie Sahlberg
d6ce023c68 create a helper function to make sure the local node that does recovery
has all the databases that exist on any other remote node

(This used to be ctdb commit 0f436e3d40fea6e6a146019b0c664e80e81e88b4)
2007-05-06 10:12:42 +10:00
Ronnie Sahlberg
0e436f5058 add a helper function to create all missing remote databases detected
during recovery

(This used to be ctdb commit 04758c6f7d8f61260be6d2472380cb7904984427)
2007-05-06 10:04:37 +10:00
Ronnie Sahlberg
cadfb24b41 break out the setting/clearing of recovery mode into a dedicated helper
function

(This used to be ctdb commit dba4e4f8aa4f2fde1e9f8d93bdf3a33f7de8ce18)
2007-05-06 09:53:12 +10:00
Ronnie Sahlberg
c9aafae5ce don't allocate arrays where we can just return a single integer
(This used to be ctdb commit 07bc338e490e0f7018808a2450bc54863eb88c94)
2007-05-06 08:05:22 +10:00
Ronnie Sahlberg
dceab7ff3e don't use arrays where a uint32_t works just as well
(This used to be ctdb commit 843e974b29c93df891ae7cf13323ee960a334f60)
2007-05-06 07:52:20 +10:00
Ronnie Sahlberg
ad41dff7bf add an #ifdef'ed-out block to the call.
we really should kill the event if the call completes before the
timeout, so that we can also make timed_out non-static

(This used to be ctdb commit f297eed589b1d4e188f77f195683365cf91d0e62)
2007-05-06 07:32:16 +10:00
Ronnie Sahlberg
4f2cdc2d8b the timed_out variable needs to be static and cannot be on the stack:
if the command times out and we return from ctdb_control, we may still
have events that can trigger later and overwrite data that is no
longer in our stack frame

(This used to be ctdb commit 93942543092be618c0bd8ef68b470b0789bad7ad)
2007-05-06 07:07:47 +10:00
Ronnie Sahlberg
c6bd23ee11 update to the recovery daemon
ctdb_ctrl_ calls are timed out due to nodes arriving or leaving the
cluster; this crashes the recovery daemon afterwards with a SEGV but no
useful stack backtrace

(This used to be ctdb commit cd3abc7349e86555ccd87cd47a1dcc2adad2f46c)
2007-05-06 06:58:01 +10:00
Ronnie Sahlberg
60d4b0e8b4 in the recover test,
start the daemons with explicit socket names and explicit IP address/port

remove all --socket= from all ctdb_control calls since they are no
longer needed

(This used to be ctdb commit 593a959d428f5b4a913117a9b5c8fe65a3eb950e)
2007-05-06 06:06:39 +10:00
Ronnie Sahlberg
7bbcc964f2 add support in catdb to dump the content of a specific node's tdb instead
of traversing the full cluster.
this makes it easier to debug recovery

update the test script for recovery to reflect the newish signatures to
ctdb_control

the catdb control still segfaults, however, when there are missing
nodes in the cluster, as there are toward the end of the recovery test

(This used to be ctdb commit 8de2a97c14a444f817ceb36461314f10c9601ecc)
2007-05-06 05:53:15 +10:00
Ronnie Sahlberg
0f6d9c73d8 merge from tridge
(This used to be ctdb commit 08173e3ab77178b9841db0081a51b93291d9e8dc)
2007-05-06 04:38:41 +10:00
Ronnie Sahlberg
25edbc9a50 add a control to get the pid of a daemon.
this makes it possible to kill a specific daemon in the recover test 
script

(This used to be ctdb commit 2fa394b4c80988cb1a6d04b236ec64cc9d9e8a40)
2007-05-06 04:31:22 +10:00
Andrew Tridgell
a3c70ac520 merge relevant lib code from samba4
(This used to be ctdb commit 8076a7c7e12da6d59bae31a2e4a0267d87c7b1b3)
2007-05-05 17:46:54 +10:00
Andrew Tridgell
7d48810645 merged vnn map broadcast from ronnie
(This used to be ctdb commit c0fa029435fdaa0be006b28eddb6b31beb2ee605)
2007-05-05 17:35:28 +10:00
Andrew Tridgell
542b76136e - take advantage of the new EVENT_FD_AUTOCLOSE flag
- use the tdb_chainlock_mark() call to allow us to guarantee forward progress in the ctdb_lockwait code

(This used to be ctdb commit e201e98aad0fef6a779a80f3b1ae7792953e2d6b)
2007-05-05 17:19:59 +10:00
Andrew Tridgell
cc8ac1ca6b allow the events system to be chosen on the command line
(This used to be ctdb commit 2fe976d7a376a763472cc7952a78b6249ce416c8)
2007-05-05 17:18:43 +10:00
Andrew Tridgell
24ed74a454 use the new lib/events autoconf code
(This used to be ctdb commit fec779711e8c4d6e047d792aee744e60e5a9f67c)
2007-05-05 17:18:06 +10:00
Andrew Tridgell
3bfeb3d235 - added an EVENT_FD_AUTOCLOSE flag that allows you to tell the event system to close the fd automatically when an fd_event is freed. This prevents races which can lead to epoll missing events
- added autoconf rules for automatically building with epoll support

(This used to be ctdb commit 4d113298b26f7163992f2e47429c953bd4f957c9)
2007-05-05 17:17:25 +10:00
Andrew Tridgell
d903e9542d added tdb_chainlock_mark() call, which can be used to mark a chain locked without actually locking it. This will be used to guarantee forward progress in the ctdb non-blocking lockwait code
(This used to be ctdb commit 2af98c3418496b39106c7282f550049ec8239657)
2007-05-05 17:14:33 +10:00
Ronnie Sahlberg
2e64727079 merge from tridge
(This used to be ctdb commit 8648104f8d76d22427c14422b126f7e979cc2d95)
2007-05-05 16:51:34 +10:00
Andrew Tridgell
9636c97c5a show number of connected clients in status output
(This used to be ctdb commit 99765bbe327bfe9c43415f4943281458f25be51b)
2007-05-05 14:09:46 +10:00
Ronnie Sahlberg
5cb817f031 split the vnn broadcast address into two
one broadcast address for all nodes
and one broadcast address for all nodes in the current vnnmap

update all usage of the old flag to now only broadcast to the vnnmap,
except for tools/ctdb_control, where it makes more sense to broadcast to
all nodes

(This used to be ctdb commit dfb65b88cf67ad9d61268c4b47a6d8ae346f47df)
2007-05-05 13:17:26 +10:00
Ronnie Sahlberg
86b78a5d04 merge from tridge
(This used to be ctdb commit ea45ffe3fe479d49b6ed47cb545ee662655f6187)
2007-05-05 11:46:44 +10:00
Andrew Tridgell
410d41480a added a dumpmemory control, used to find memory leaks
(This used to be ctdb commit 44fdafaf421e3e906796d529aed2f7c5df201b94)
2007-05-05 11:03:10 +10:00
Andrew Tridgell
adc64aed0a - fixed a crash bug after client disconnect in ctdb_control
- added total memory used to ctdb_control status output

(This used to be ctdb commit a99ffe4372edc63d83d8c8ebf9a60b3413301f5a)
2007-05-05 08:33:35 +10:00
Andrew Tridgell
d8f4e6b209 - added counters for controls in ctdb_control status
(This used to be ctdb commit 858061372fc9902837a1a5b8bcfc0ada58eec193)
2007-05-05 08:11:54 +10:00
Andrew Tridgell
bdad1edcb5 merged from ronnie
(This used to be ctdb commit 88f0977f303836b50aa9239a9eb3447646bc1e3f)
2007-05-05 07:39:23 +10:00
Ronnie Sahlberg
1725fcf294 merge from tridge
(This used to be ctdb commit 62574808ef4dcb76760f1dd2496fbe8e34197c23)
2007-05-05 01:22:30 +10:00
Andrew Tridgell
fccc585f5a added seqnum propagation code to ctdb
(This used to be ctdb commit be2572b1b09eaaa1ea6a726d60f16996f9407d13)
2007-05-04 22:18:00 +10:00
Ronnie Sahlberg
508cafd17e merge from tridge
(This used to be ctdb commit 6c8b90cedc67daa89d54db5268fde18bfc20abaf)
2007-05-04 17:05:28 +10:00
Andrew Tridgell
ed3e847785 added a ctdb control for enabling the tdb seqnum
(This used to be ctdb commit c66920d9fb08a4a33418e2c1dcf1fc320fba3761)
2007-05-04 15:33:28 +10:00
Andrew Tridgell
6bc3758082 added a tdb_enable_seqnum() function
(This used to be ctdb commit 1f89da231c6637e339d5da156d6a48340706fe61)
2007-05-04 15:29:10 +10:00
Ronnie Sahlberg
418cb36d32 remove an exit from the test script
(This used to be ctdb commit 4adb61f8270dbd15732bc458d49a66138dd240cc)
2007-05-04 15:25:57 +10:00
Ronnie Sahlberg
7dfdab1b9d recovery daemon
this program is a client to the local ctdb daemon

every second it pulls all vnnmap and nodemaps from all nodes that are 
available and checks if a recovery is required

a recovery is required if:
* NOT all nodes have an identical vnnmap and generation
* NOT all nodes have an identical nodemap
* there are active nodes that are NOT in the nodemap
* there are nodes in the nodemap that are NOT active

During recovery, the recovery tool will also make sure that all nodes
know about and have created all databases.

(This used to be ctdb commit 2f2650467bac7e8954de7c17cb34f46b0bdbcd26)
2007-05-04 15:21:40 +10:00
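
The four recovery conditions listed in the commit above can be sketched as a single check over what each node reports. This is a simplified sketch under loud assumptions, not the recovery daemon's code: struct node_view, MAX_NODES and recovery_needed() are hypothetical, and both the vnnmap and the nodemap are modelled as plain lists of vnn numbers, which is much less than the real structures carry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_NODES 8   /* hypothetical fixed bound for this sketch */

/* Hypothetical, simplified view of what node i reports when polled. */
struct node_view {
    bool     active;              /* did this node answer the poll? */
    uint32_t generation;          /* generation of its vnnmap */
    uint32_t vnnmap_size;
    uint32_t vnnmap[MAX_NODES];
    uint32_t nodemap_size;
    uint32_t nodemap[MAX_NODES];
};

static bool contains(const uint32_t *list, uint32_t n, uint32_t vnn)
{
    for (uint32_t i = 0; i < n; i++) {
        if (list[i] == vnn) {
            return true;
        }
    }
    return false;
}

static bool same_list(const uint32_t *a, uint32_t na,
                      const uint32_t *b, uint32_t nb)
{
    return na == nb && memcmp(a, b, na * sizeof(uint32_t)) == 0;
}

/* views[] is indexed by vnn. Returns true if any of the four
 * conditions from the commit message holds. */
bool recovery_needed(const struct node_view *views, uint32_t num_nodes)
{
    const struct node_view *ref = NULL;

    assert(num_nodes <= MAX_NODES);

    /* Conditions 1 and 2: every active node must agree with the first
     * active node on vnnmap, generation and nodemap. */
    for (uint32_t i = 0; i < num_nodes; i++) {
        const struct node_view *v = &views[i];
        if (!v->active) {
            continue;
        }
        assert(v->vnnmap_size <= MAX_NODES && v->nodemap_size <= MAX_NODES);
        if (ref == NULL) {
            ref = v;
        }
        if (v->generation != ref->generation ||
            !same_list(v->vnnmap, v->vnnmap_size,
                       ref->vnnmap, ref->vnnmap_size) ||
            !same_list(v->nodemap, v->nodemap_size,
                       ref->nodemap, ref->nodemap_size)) {
            return true;
        }
    }
    if (ref == NULL) {
        return false;   /* nobody answered, nothing to compare */
    }

    /* Condition 4: a node listed in the nodemap is not active. */
    for (uint32_t j = 0; j < ref->nodemap_size; j++) {
        uint32_t vnn = ref->nodemap[j];
        if (vnn >= num_nodes || !views[vnn].active) {
            return true;
        }
    }

    /* Condition 3: an active node is missing from the nodemap. */
    for (uint32_t i = 0; i < num_nodes; i++) {
        if (views[i].active &&
            !contains(ref->nodemap, ref->nodemap_size, i)) {
            return true;
        }
    }
    return false;
}
```

Comparing every node against one reference view keeps the check linear in the number of nodes; a daemon polling once per second, as described above, would simply call such a function after each poll.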