1
0
mirror of https://github.com/samba-team/samba.git synced 2025-02-04 17:47:26 +03:00

222 Commits

Author SHA1 Message Date
Andrew Tridgell
cc4d8102cd moved system specific ip code to system.c
(This used to be ctdb commit 9de9e4ccda9665108baac12a8716b189d26340b1)
2007-05-26 14:01:08 +10:00
Andrew Tridgell
31053286c5 keep sending ARPs for 2 minutes, every 5 seconds
(This used to be ctdb commit d5223f2eed4a762b93a101c720286568578ce7ed)
2007-05-25 21:27:26 +10:00
Andrew Tridgell
7a9e40b288 consider a node dead after 6 seconds, not 15
(This used to be ctdb commit b055907f0bd2fa0e83bd84e49039fa868905b941)
2007-05-25 20:00:06 +10:00
Andrew Tridgell
56e3eed3d1 added IP takeover logic for public IPs to ctdb
(This used to be ctdb commit 374adb729472670f35cef41269b8719f49c0de0e)
2007-05-25 17:04:13 +10:00
Ronnie Sahlberg
2b6c39a0af add controls to take over and release an ip address
add sending of grat arp     both normal grat arp (request) and also
unsolicited grat arp replies

(This used to be ctdb commit 7305c00c21c30bdbafc3722a018513378bd307e6)
2007-05-25 13:05:25 +10:00
Andrew Tridgell
7596347844 make ctdbd realtime if possible
(This used to be ctdb commit 8852f6cca52b64a5239c83ab7c6a99ae4edb2597)
2007-05-24 14:52:10 +10:00
Andrew Tridgell
70912e2b0c added automatic vacuuming of empty records during recovery
(This used to be ctdb commit f9181a784ac7009df5e9c996f4e0c3e99098b59a)
2007-05-23 17:21:14 +10:00
Andrew Tridgell
74bf76ca10 merge from ronnie
(This used to be ctdb commit 267481b67152bc5885884d223085aa9ef5fe73bd)
2007-05-23 14:50:41 +10:00
Andrew Tridgell
76b2822340 - startup frozen, and do an initial recovery
- fixed a bug in traverse
- get a lock on the node list file in the recmaster recovery daemon

(This used to be ctdb commit 162a5647535ad1cb3e8e5d4042a2784365fb1913)
2007-05-23 14:35:19 +10:00
Andrew Tridgell
9f7a70657f start ctdb frozen, and let the election sort things out. This prevents a race on startup
(This used to be ctdb commit b788ed3fa64e31e517b4e602e8bd3ae7201ecddd)
2007-05-23 12:23:07 +10:00
Ronnie Sahlberg
e989a1bac8 add controls to enable/disable the monitoring of dead nodes
(This used to be ctdb commit 79d29c39bb81feb069db3fc6d3d392c1e75a4d13)
2007-05-21 09:24:34 +10:00
Andrew Tridgell
d549f1e1a3 merge from ronnie
(This used to be ctdb commit 985d718e03510398b9a5cfdf6a4d559a90738a11)
2007-05-19 17:21:58 +10:00
Ronnie Sahlberg
02a9f1b0a0 use ctdb_dead_node() instead of reimplementing the same code again
this leaves only one single function where a node is marked as dead
instead of two places

(This used to be ctdb commit aa764ea26cc26d5c1ae188105236da603576f45b)
2007-05-19 16:59:10 +10:00
Andrew Tridgell
a14fd9d29c make sure we don't increment rx_cnt for redirected packets, or for packets that have been requeued after a lockwait
(This used to be ctdb commit 92e5569407dba173a27e9645b4339ce3e2c00520)
2007-05-19 13:45:24 +10:00
Ronnie Sahlberg
9f7b9faf64 add a node->tx_cnt counter
only send keepalive packets if the count is zero

(This used to be ctdb commit 2cbd424231caccf0a531cf6501761115efe68f3e)
2007-05-19 10:20:19 +10:00
Andrew Tridgell
28f2fc669b a better way to resend calls after recovery
(This used to be ctdb commit 444f52e134fc22aaf254d05c86d8b357ded876f4)
2007-05-19 00:56:49 +10:00
Andrew Tridgell
049e1504ee timeout pending controls immediately when a node becomes disconnected
(This used to be ctdb commit 93c4b16f4efef383ba8db83953019ef4821613e0)
2007-05-18 23:48:29 +10:00
Andrew Tridgell
346dfc1bef - up rx_cnt on all packet types
- notice when a node becomes available again

(This used to be ctdb commit e05110dd6112e81f224937dfd7370d963ce9531a)
2007-05-18 23:23:36 +10:00
Ronnie Sahlberg
db4c479568 add dead node detection so that if a node does not generate any
keepalive traffic for x seconds   it is deemed dead


this triggers a recovery after a while if a ctdbd has been STOPPED    
but it doesnt recover automatically when the node reappears

(This used to be ctdb commit d6324afe0d13b5e21d06e347caca433c6b36a32a)
2007-05-18 19:19:35 +10:00
Andrew Tridgell
49fe66713f - don't try to send controls to dead nodes
- use only connected nodes in a traverse

(This used to be ctdb commit 9a676dd5d331022d946a56c52c42fc6985b93dbc)
2007-05-17 23:23:41 +10:00
Andrew Tridgell
874fd5c2f7 removed the CTDB_CTRL_FLAG_NOREQUEUE flag
(This used to be ctdb commit 366e849f6f350eda78d79cf1ea55c2637e605c86)
2007-05-17 14:10:38 +10:00
Ronnie Sahlberg
cc760cf13a add a control to shutdown/kill a node
(This used to be ctdb commit 3802f7304fd59d56062c855987e2561753e85a69)
2007-05-17 10:45:31 +10:00
Andrew Tridgell
c105f6d789 - merge from ronnie
- fixed a memory leak found by dmitry

(This used to be ctdb commit ae87bf0005666b50850161c3843d6bc7cb5c8971)
2007-05-16 18:10:26 +10:00
Ronnie Sahlberg
a4ebb6d5ef if a caller specifies a timeout when calling a control, it makes no
sense to have the daemon requeue the packets if they timeout or fail to 
deliver to the remote node

(This used to be ctdb commit 9fb753046787190970654aeb937e96685ac53184)
2007-05-16 12:34:30 +10:00
Andrew Tridgell
a5198559c9 moved the recovery daemon into the main ctdbd and enable it by default
(This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0)
2007-05-15 15:13:36 +10:00
Andrew Tridgell
c6afe22b92 added a control to get the local vnn
(This used to be ctdb commit 0b109f574b710f290372512d0694290ea7cd4368)
2007-05-15 10:17:16 +10:00
Andrew Tridgell
81826da2df added error messages in ctdb_control replies
(This used to be ctdb commit bd848f5b760e6b2a73ebfc67fd8adb3c31479fb5)
2007-05-12 21:25:26 +10:00
Andrew Tridgell
36ccc10389 make sure we ignore requeued ctdb_call packets of older generations except for packets from the client
(This used to be ctdb commit facab105fbd7fe50f96bdd763ae50ddc54fbdacc)
2007-05-12 18:08:50 +10:00
Andrew Tridgell
2c90d9e794 show total frozen/recoving in status
(This used to be ctdb commit 0d0eb66a63fe6912edb85bf7387ac76acb70babd)
2007-05-12 15:51:08 +10:00
Andrew Tridgell
9cf77dd23f separate out the freeze/thaw handling from recovery
(This used to be ctdb commit 0b0640bd8b8334961f240e0cf276ac112cd6e616)
2007-05-12 15:15:27 +10:00
Andrew Tridgell
74a799a83b added lockwait child code for entering recovery mode. A child processes holds lockall locks for the entire recovery process
(This used to be ctdb commit f892f30def75b0d964c35eae38c4cf675597dd28)
2007-05-12 14:34:21 +10:00
Andrew Tridgell
f8765b19bf - got rid of the complex hand marshalling in the recovery controls
- fixed the re-send of ctdb calls after a generation change

- fixed a reqid idr leak in controls

- removed the write_record test code

- use the new nonblock lockall code to prevent ctdbd from ever doing a
  blocking lock that could deadlock with smbd

- moved more of the recovery controls into ctdb_recover.c

(This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec)
2007-05-10 17:43:45 +10:00
Andrew Tridgell
15bc97cdaa better timeout handling for calls, controls and traverses
(This used to be ctdb commit 63346a6c59d4821b4c443939b5d88db8cd20f5fe)
2007-05-10 14:06:48 +10:00
Andrew Tridgell
31cd92dc7e merge from ronnie
(This used to be ctdb commit 92b7a849565730744c75a7fb776173554e9f57bf)
2007-05-10 13:15:58 +10:00
Andrew Tridgell
682df74d59 separate the wire format and internal format for the vnn_map
(This used to be ctdb commit 9a71718d87c5162f1423d85c2e86a01f6771925e)
2007-05-10 08:13:19 +10:00
Andrew Tridgell
fdb8144e62 fixed a problem with the number of timed events growing without bound with the new seqnum code
(This used to be ctdb commit 6109ae3dae8d93c93a2dc76cc561ea6e21458aa6)
2007-05-08 21:16:29 +10:00
Ronnie Sahlberg
a9657f6aa5 add new controls to get and set the recovery master node of a daemon
i.e. which node is "elected" to check for and drive recovery

(This used to be ctdb commit d577093eb4b619392c71ab5ce81e8c02565d93f0)
2007-05-07 05:02:48 +10:00
Ronnie Sahlberg
25edbc9a50 add a control to get the pid of a daemon.
this makes it possible to kill a specific daemon in the recover test 
script

(This used to be ctdb commit 2fa394b4c80988cb1a6d04b236ec64cc9d9e8a40)
2007-05-06 04:31:22 +10:00
Ronnie Sahlberg
2e64727079 merge from tridge
(This used to be ctdb commit 8648104f8d76d22427c14422b126f7e979cc2d95)
2007-05-05 16:51:34 +10:00
Andrew Tridgell
9636c97c5a show number of connected clients in status output
(This used to be ctdb commit 99765bbe327bfe9c43415f4943281458f25be51b)
2007-05-05 14:09:46 +10:00
Ronnie Sahlberg
5cb817f031 split the vnn broadcast address into two
one broadcast address for all nodes
and one broadcast address for all nodes in the current vnnmap

update all useage of the old flag to now only broadcast to the vnnmap
except for tools/ctdb_control where it makes more sense to broadcast to 
all nodes

(This used to be ctdb commit dfb65b88cf67ad9d61268c4b47a6d8ae346f47df)
2007-05-05 13:17:26 +10:00
Andrew Tridgell
410d41480a added a dumpmemory control, used to find memory leaks
(This used to be ctdb commit 44fdafaf421e3e906796d529aed2f7c5df201b94)
2007-05-05 11:03:10 +10:00
Andrew Tridgell
adc64aed0a - fixed a crash bug after client disconnect in ctdb_control
- added total memory used to ctdb_control status output

(This used to be ctdb commit a99ffe4372edc63d83d8c8ebf9a60b3413301f5a)
2007-05-05 08:33:35 +10:00
Andrew Tridgell
d8f4e6b209 - added counters for controls in ctdb_control status
(This used to be ctdb commit 858061372fc9902837a1a5b8bcfc0ada58eec193)
2007-05-05 08:11:54 +10:00
Ronnie Sahlberg
1725fcf294 merge from tridge
(This used to be ctdb commit 62574808ef4dcb76760f1dd2496fbe8e34197c23)
2007-05-05 01:22:30 +10:00
Andrew Tridgell
fccc585f5a added seqnum propogation code to ctdb
(This used to be ctdb commit be2572b1b09eaaa1ea6a726d60f16996f9407d13)
2007-05-04 22:18:00 +10:00
Andrew Tridgell
ed3e847785 added a ctdb control for enabling the tdb seqnum
(This used to be ctdb commit c66920d9fb08a4a33418e2c1dcf1fc320fba3761)
2007-05-04 15:33:28 +10:00
Ronnie Sahlberg
7dfdab1b9d recovery daemon
this program is a client to the local ctdb daemon

every second it pulls all vnnmap and nodemaps from all nodes that are 
available and checks if a recovery is required

a recovery is required if :
* all nodes do NOT have an identical vnnmap and generation
* all nodes do NOT have an identical nodemap
* there are active nodes that are NOT in the nodemap
* there are nodes in the nodemap that are NOT active

During recovery,  the recovery tool will also make sure that all nodes 
know about and have created all databases.

(This used to be ctdb commit 2f2650467bac7e8954de7c17cb34f46b0bdbcd26)
2007-05-04 15:21:40 +10:00
Andrew Tridgell
e752f3bd97 - changed the REQ_REGISTER PDU to be a control
- allow controls to know which client invoked them

- added a client_id to clients, so they can be identified remotely

- added the ability to remove registered srvids

- in the list_keys code, register a temp srvid, then remove it afterwards

(This used to be ctdb commit 29603c51cc6d81362532cd8e50f75c8360c5f5ef)
2007-05-04 11:41:29 +10:00
Ronnie Sahlberg
ebc478749b start working on a recovery daemon
change ctdb_control so it takes a timeval pointer as argument.
this is the timeout. if the node has not responded within hte timeout
ctdb_control will return an error instead of hanging.
if the timeval pointer is NULL then the call will block indefinitely if 
there is no response.

this is used for now in the createdb control   but all the helpers 
ctdb_ctrl_* should probably be updated to take a timeout parameter as 
well.

(This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211)
2007-05-04 08:30:18 +10:00