mirror of https://github.com/samba-team/samba.git synced 2025-01-06 13:18:07 +03:00
Commit Graph

80 Commits

Author SHA1 Message Date
Sumit Bose
11988fc77a structure member node_list_file is not used anywhere
(This used to be ctdb commit 0e84ea23d1d998d4d4ac7d8a858b3d8294f056cb)
2009-05-21 11:16:43 +10:00
Ronnie Sahlberg
28bbe2f407 don't call ctdb_fatal() just because we are asked to restart a connection
to a remote node and ctdb->methods is NULL.

This can happen when we are in the middle of a normal shutdown of the
daemon and we have already shut down the transport layer (thus setting
ctdb->methods == NULL in the transport layer destructor)
and there is some unprocessed data related to a remote node.

This prevents an ugly race condition where ctdb might sometimes (rare)
cause a core dump during "ctdb shutdown".

(This used to be ctdb commit fc4e8b5a5d3699221620a8d76701c8589f2b4ff1)
2008-12-17 12:04:41 +11:00
Ronnie Sahlberg
374906860c from Michael Adams : allow #-style comments in the nodes and public
addresses file

(This used to be ctdb commit 5f96b33a379c80ed8a39de1ee41f254cf48733f9)
2008-10-07 19:25:10 +11:00
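A minimal sketch of the kind of filtering this commit describes, assuming a comment starts at the first '#' and runs to the end of the line. The helper name `strip_comment` is hypothetical and is not taken from the ctdb source; the real parser may treat commented lines differently.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper, not ctdb's actual parser: cut the line at the
 * first '#', then trim trailing whitespace in place.
 * Returns 1 if anything is left to parse, 0 if the line is now empty. */
static int strip_comment(char *line)
{
	assert(line != NULL);

	char *hash = strchr(line, '#');
	if (hash != NULL) {
		*hash = '\0';
	}

	size_t len = strlen(line);
	while (len > 0 && isspace((unsigned char)line[len - 1])) {
		line[--len] = '\0';
	}
	return len > 0;
}
```

A caller reading the nodes or public addresses file would run each line through the helper and skip the line when it returns 0.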
Ronnie Sahlberg
2003196816 we need a 'case x:' in our ugly 'encode the control opcode as a linenumber in valgrind output' hack to make it work
(This used to be ctdb commit f4929e164be1703f74fc332e740b85cfe1ae3e73)
2008-07-07 08:52:04 +10:00
Andrew Tridgell
9999f18369 an extraordinarily ugly patch!
This is a hack to allow backtraces under valgrind to show what opcode
is getting uninitialised bytes

(This used to be ctdb commit 67bb12c8f0af5914efb44b76bc6ddbb11fc0fcdf)
2008-07-04 18:00:24 +10:00
Ronnie Sahlberg
adf40341a7 ctdb->methods becomes NULL when we shutdown the transport.
If we shut down the transport and CTDB later decides to send a command out
for queueing, the call to ctdb->methods->allocate_pkt() will SEGV.

This could trigger, for example, when we are in the process of shutting down CTDBD and have already shut down the transport but are still waiting for the
"shutdown" eventscripts to finish.
If the event scripts now take much longer to execute for some reason, this
race condition becomes much more probable.

Decorate all dereferencing of ctdb->methods-> with a check that ctdb->methods is non-NULL

(This used to be ctdb commit c4c2c53918da6fb566d6e9cbd6b02e61ae2921e7)
2008-05-11 14:28:33 +10:00
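The guard described in this commit could look roughly like the sketch below. The structures are simplified stand-ins for the real ctdb types, and `transport_allocate` and `malloc_pkt` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the real ctdb structures (assumption). */
struct ctdb_context;

struct ctdb_methods {
	void *(*allocate_pkt)(struct ctdb_context *ctdb, size_t size);
};

struct ctdb_context {
	/* NULL once the transport layer has been shut down */
	const struct ctdb_methods *methods;
};

/* Hypothetical wrapper: return NULL instead of dereferencing a NULL
 * methods pointer when the transport is already gone. */
static void *transport_allocate(struct ctdb_context *ctdb, size_t size)
{
	assert(ctdb != NULL);

	if (ctdb->methods == NULL) {
		return NULL; /* transport is down; the caller drops the packet */
	}
	return ctdb->methods->allocate_pkt(ctdb, size);
}

/* Trivial allocator used only to exercise the wrapper. */
static void *malloc_pkt(struct ctdb_context *ctdb, size_t size)
{
	(void)ctdb;
	return malloc(size);
}
```

Routing every call through one wrapper like this keeps the NULL check in a single place, which is one way to read "decorate all dereferencing"; the actual commit may instead check at each call site.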
Ronnie Sahlberg
f3b474cffb Add debug output to indicate why a node starts up in DISABLED state
(This used to be ctdb commit 8df75775966ead36e1073896fedeff674a6e0587)
2008-02-22 09:52:57 +11:00
Ronnie Sahlberg
39539f6044 Add a new parameter to /etc/sysconfig/ctdb
CTDB_START_AS_DISABLED="yes"

and command line argument
--start-as-disabled

When set, this makes the ctdb node always start in DISABLED mode, so it will not host any public ip addresses.
The administrator must manually "ctdb enable" the node after it has started when the node should start hosting public ip addresses.

Using this option it is possible to start ctdb on a node without causing any reallocation of ip addresses when it is starting. The node will still merge with the cluster and there will still be a recovery phase but the ip address allocations will not change in the cluster.

(This used to be ctdb commit b93d29f43f5306c244c887b54a77bca8a061daf2)
2008-02-22 09:42:52 +11:00
Ronnie Sahlberg
9f99b44fd1 to make it easier/less disruptive to add nodes to a running cluster
add a new control that causes the node to drop the current nodes list
and reread it from the nodes file.
During this operation, the node will also drop the tcp layer and restart it.

When we drop the tcp layer, by talloc_free()ing the ctcp structure
add a destructor to ctcp so that we also can clean up and remove the references in the ctdb structure to the transport layer

add two new commands for the ctdb tool.
one to list all nodes in the nodes file and the second a command to trigger a node to drop the transport and reinitialize it with the new nodes file

(This used to be ctdb commit 4bc20ac73e9fa94ffd43cccb6eeb438eeff9963c)
2008-02-19 14:44:48 +11:00
Andrew Tridgell
f6e53f433b merge from ronnie
(This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c)
2008-02-04 20:07:15 +11:00
Andrew Tridgell
9d6ac0cf55 added debug constants to allow for better mapping to syslog levels
(This used to be ctdb commit 7ba8f1dde318eab03f4257e5a89fd23e7281e502)
2008-02-04 17:44:24 +11:00
Andrew Tridgell
b62b7fcde8 added syslog support, and use a pipe to catch logging from child processes to the ctdbd logging functions
(This used to be ctdb commit 1306b04cd01e996fd1aa1159a9521f2ff7b06165)
2008-01-16 22:03:01 +11:00
Ronnie Sahlberg
9e73dc87cc Add a --node-ip argument so that one can specify which ip address a
specific instance of ctdbd should bind to. This helps when running a
"virtual" cluster on a single machine where all instances bind to
different alias interfaces.

If --node-ip is specified, then we will try to bind to this ip
address only. Otherwise we fall back to the original method of trying the
ip addresses in /etc/ctdb/nodes one by one until we find one we can bind 
to.

No variable in /etc/sysconfig/ctdb added since this parameter only makes 
sense in a virtual test/debug cluster.

(This used to be ctdb commit d96cb02c2c24f9eabbc53d3d38e90dea49cff3e0)
2007-11-26 10:52:55 +11:00
Ronnie Sahlberg
3c1f9882a8 revert 773
(This used to be ctdb commit 5a1c8f458ddc9b0ff532afda6007e32db10a71c8)
2007-11-12 10:23:35 +11:00
Ronnie Sahlberg
df5dd43e7c add a new tunable "CheckNodesFile" that when set to 0 will disable the
check in the recovery daemon that all nodes are using the same 
/etc/ctdb/nodes file.

Also add some more missing checks that the pnn used is a valid pnn 
before using it to dereference the ctdb->nodes array


This is useful since it allows us to add more physical nodes to an
existing cluster without having to bring down the entire cluster.

The procedure to add an additional node to an existing cluster would then be
1, on all nodes set CheckNodesFile=0 using 'ctdb setvar'
2, on all nodes add CTDB_SET_CheckNodesFile=0 to /etc/sysconfig/ctdb
For each node, one at a time :
3, use 'ctdb disable' to stop the hosted services
4, service ctdb stop
5, service ctdb start
Once all nodes have been restarted 
6, on all nodes remove CTDB_SET_CheckNodesFile=0 from 
/etc/sysconfig/ctdb
7, on all nodes set CheckNodesFile=1 using 'ctdb setvar'

8, configure and start up the new node

During this procedure, only one node at a time was brought 
down/restarted and was so only for a short period.

(This used to be ctdb commit 462501a32143e943ce350bd904a47c0955414a51)
2007-11-05 13:36:11 +11:00
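The pnn checks mentioned in this commit amount to a bounds test before indexing the node array. The sketch below assumes a simplified context with a `num_nodes` count; `pnn_is_valid` and `node_lookup` are hypothetical names, not the functions in the ctdb tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the real ctdb structures (assumption). */
struct ctdb_node {
	uint32_t pnn;
};

struct ctdb_context {
	uint32_t num_nodes;
	struct ctdb_node **nodes;
};

/* A pnn is usable as an index only if it is below the node count. */
static bool pnn_is_valid(const struct ctdb_context *ctdb, uint32_t pnn)
{
	assert(ctdb != NULL);
	return pnn < ctdb->num_nodes;
}

/* Hypothetical lookup: validate first, dereference second. */
static struct ctdb_node *node_lookup(const struct ctdb_context *ctdb,
				     uint32_t pnn)
{
	if (!pnn_is_valid(ctdb, pnn)) {
		return NULL;
	}
	return ctdb->nodes[pnn];
}
```

Because the pnn is unsigned, a single comparison against the count also rejects very large values that would otherwise read far past the end of the array.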
Ronnie Sahlberg
d1ba047b7f add a new transport method so that when a node is marked as dead, we
shut down and restart the transport

otherwise, if we use the tcp transport the tcp connection might try to
retransmit the queued data during the time the node is unavailable.
this together with the exponential backoff for tcp means that the tcp 
connection quickly reaches the maximum backoff rto which is often 60 or 
120 seconds.   this would mean that it could take up to 60/120 seconds 
before the tcp layer detects that the connection is dead and it has to 
be reestablished.

(This used to be ctdb commit 0256db470879ce556b0f00070f7ebeaf37e529ab)
2007-10-19 08:58:30 +10:00
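To illustrate how quickly a doubling retransmission timeout reaches the cap mentioned in this commit, here is a small sketch. The starting timeout of 1 second is an assumption chosen for the arithmetic, not a value taken from ctdb or from any particular TCP stack, and `retries_until_cap` is a hypothetical helper.

```c
#include <assert.h>

/* Count how many doublings it takes for a retransmission timeout to
 * reach the cap. Both arguments are in seconds and are illustrative
 * assumptions, not measurements from a ctdb cluster. */
static unsigned retries_until_cap(unsigned initial_rto, unsigned max_rto)
{
	assert(initial_rto > 0 && initial_rto <= max_rto);

	unsigned retries = 0;
	unsigned rto = initial_rto;

	while (rto < max_rto) {
		rto *= 2;
		if (rto > max_rto) {
			rto = max_rto; /* clamp to the maximum backoff */
		}
		retries++;
	}
	return retries;
}
```

Under these assumptions, `retries_until_cap(1, 60)` is 6 and `retries_until_cap(1, 120)` is 7, so only a handful of lost retransmissions are needed before each further attempt waits the full 60 or 120 seconds.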
Ronnie Sahlberg
755511d28d set the flags explicitly instead of masking them in
(This used to be ctdb commit 27a5f9dead44890683f9dbc4f07cda11264aa03b)
2007-10-18 16:54:00 +10:00
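The commit message does not say why the change was needed. One plausible reading is that OR-ing reported flags into the stored value can never clear a stale bit, while assigning them can. The sketch below shows that difference with hypothetical flag names and values that are not taken from the ctdb headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration only. */
#define EX_FLAG_DISCONNECTED 0x1u
#define EX_FLAG_UNHEALTHY    0x2u
#define EX_FLAG_ALL          (EX_FLAG_DISCONNECTED | EX_FLAG_UNHEALTHY)

/* "Masking in": bits already set in current survive even when the
 * reported value no longer carries them. */
static uint32_t ex_flags_mask_in(uint32_t current, uint32_t reported)
{
	return current | reported;
}

/* "Setting explicitly": the reported value replaces the old one. */
static uint32_t ex_flags_set(uint32_t current, uint32_t reported)
{
	(void)current;
	assert((reported & ~EX_FLAG_ALL) == 0); /* reject unknown bits */
	return reported;
}
```

With a stored value of `EX_FLAG_UNHEALTHY` and a reported value of 0, the first function still returns `EX_FLAG_UNHEALTHY` while the second returns 0.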
Andrew Tridgell
011a205b86 make sure reconnected nodes start off as unhealthy so they don't get a public IP
(This used to be ctdb commit c733ec6760cae01ce277f491caf1355e46de5cf7)
2007-10-10 10:45:22 +10:00
Andrew Tridgell
b87ddd9148 no longer wait at startup for services to become available, instead
set the node initially unhealthy and let the status monitoring bring the node online.
This fixes a problem with winbindd, where it refused to start because secrets.tdb was not populated
but we could not populate ctdbd, because the net command would not run while ctdbd was still doing startup
and thus frozen
(This used to be ctdb commit 3a001b793dd76fb96addf1e2ccb74da326fbcfbc)
2007-09-24 10:00:14 +10:00
Andrew Tridgell
c60988325d added support for persistent databases in ctdbd
(This used to be ctdb commit 3115090a0d882beca9d70761130b74bb0821f201)
2007-09-21 12:24:02 +10:00
Ronnie Sahlberg
fc9d39c3a6 change ctdb_validate_vnn to ctdb_validate_pnn
(This used to be ctdb commit a4a1f41b69475b9dc16d8fd7f8965c32e96c32f0)
2007-09-04 10:09:58 +10:00
Ronnie Sahlberg
eb4cf6a686 change ctdb->vnn to ctdb->pnn
(This used to be ctdb commit 8c776e5707e503ec6586aae39ac6b3ea5a2fd2bc)
2007-09-04 10:06:36 +10:00
Ronnie Sahlberg
12ebb74838 change how we do public addresses and takeover so that we can have
multiple public addresses spread across multiple interfaces on each 
node.

this is a massive patch since we have previously made the assumption that
we only have one public address per node.

get rid of the public_interface argument.  the public addresses file
now explicitly lists which interface the address belongs to

(This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8)
2007-09-04 09:50:07 +10:00
Ronnie Sahlberg
50c09b7465 when we receive a packet from the network, check explicitly that the
node is not banned if the call is for a database record, i.e. a REQ/REPLY
CALL/DMASTER

if we get such a call while banned, ignore the packet and write an entry 
in the logfile

(This used to be ctdb commit 79eb0863609fbb12e28ebf734101b1d3f359b330)
2007-08-22 12:53:24 +10:00
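A rough sketch of the check described in this commit, assuming the four record-related operations are the request and reply forms of CALL and DMASTER. The enum and function names are hypothetical and do not match the ctdb protocol headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical packet operations for illustration only. */
enum ex_pkt_op {
	EX_REQ_CALL,
	EX_REPLY_CALL,
	EX_REQ_DMASTER,
	EX_REPLY_DMASTER,
	EX_REQ_CONTROL
};

/* True for the operations that act on a database record. */
static bool ex_touches_db_record(enum ex_pkt_op op)
{
	assert(op <= EX_REQ_CONTROL);

	switch (op) {
	case EX_REQ_CALL:
	case EX_REPLY_CALL:
	case EX_REQ_DMASTER:
	case EX_REPLY_DMASTER:
		return true;
	default:
		return false;
	}
}

/* A banned node ignores record traffic but still handles the rest. */
static bool ex_should_ignore(bool local_node_banned, enum ex_pkt_op op)
{
	return local_node_banned && ex_touches_db_record(op);
}
```

When `ex_should_ignore` returns true, the receive path would log the event and return without processing the packet, as the commit message describes.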
Ronnie Sahlberg
f6e0336b23 create a define to represent the 'invalid' generation id we used in two
places.

create a new helper function to generate new generation id values that
knows about the invalid id and avoids generating it.

update the ctdb status tool to know about the invalid generation id and 
print the string INVALID instead

(This used to be ctdb commit 4fbcd189543cb8a92227fdcd3d158472e558ccda)
2007-08-22 12:38:31 +10:00
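The helper and the status output described in this commit could be sketched as follows. The constant `EX_INVALID_GENERATION`, its value, and the function names are assumptions made for this example, and `rand()` stands in for whatever random source ctdb actually uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical reserved value; the real define may differ. */
#define EX_INVALID_GENERATION 1u

/* Draw candidates until one is not the reserved invalid id. */
static uint32_t ex_new_generation(void)
{
	uint32_t g;

	do {
		g = (uint32_t)rand();
	} while (g == EX_INVALID_GENERATION);
	return g;
}

/* Render a generation id for status output, printing INVALID for the
 * reserved value. buf must be large enough for a 32-bit decimal. */
static const char *ex_generation_label(uint32_t g, char *buf, size_t len)
{
	assert(buf != NULL && len >= 11);

	if (g == EX_INVALID_GENERATION) {
		return "INVALID";
	}
	snprintf(buf, len, "%u", (unsigned)g);
	return buf;
}
```

Keeping the reserved value behind a single define means the generator and the status tool cannot disagree about which id is the invalid one.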
Ronnie Sahlberg
2eef287fab print the operation code in the debug message when we discard a packet
due to incorrect generation number

(This used to be ctdb commit 3151e3b2607291572fc6e7380fd60ef7ce438307)
2007-07-11 08:41:29 +10:00
Andrew Tridgell
32de198fd3 update lib/replace from samba4
(This used to be ctdb commit f0555484105668c01c21f56322992e752e831109)
2007-07-10 15:29:31 +10:00
Andrew Tridgell
6399cf9542 added code to kill registered clients on an IP release
(This used to be ctdb commit ca0243b544987ce0618a99ac87b4abf598991e93)
2007-06-19 03:54:06 +10:00
Andrew Tridgell
18ae6e56f0 propagate flag changes to all connected nodes
(This used to be ctdb commit 711d1f7e20f1e98caaf08a57df0b1825ff6e97a0)
2007-06-09 21:58:50 +10:00
Andrew Tridgell
b50096c835 more code rearrangement
(This used to be ctdb commit 2bcf3b16163041f03add2e5bf9f1f5fb3599ec24)
2007-06-07 22:16:48 +10:00