mirror of https://github.com/samba-team/samba.git synced 2024-12-24 21:34:56 +03:00
146 Commits

Author SHA1 Message Date
Ronnie Sahlberg
e5e2f6f8f7 increase the listen queue. Now that the eventscripts may become clients and connect back to the server, we get many more concurrent connection attempts (takeip/releaseip are performed in parallel)
(This used to be ctdb commit 018f8b0b1823ef59b46f1a671aec5309d10628f4)
2009-04-06 14:00:41 +10:00
Ronnie Sahlberg
94a56ea410 rewrite the handling of flag updates across the cluster to eliminate a
race between the ctdb tool and the recovery daemon both trying at once
to push flag changes across the cluster.

(This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa)
2008-11-20 12:43:18 +11:00
Ronnie Sahlberg
06728fdac9 we actually need a ctdb_db variable
(This used to be ctdb commit aba984f1b85f5a2d370b093061cf15843ee53758)
2008-11-03 21:54:52 +11:00
Ronnie Sahlberg
d7007793ea latency is measured in us, not ms
use an explicit ctdb_db variable instead of dereferencing state

(This used to be ctdb commit 8c6a02fb423a8cbcbfc706767e3d353cd48073c3)
2008-10-30 13:34:10 +11:00
Ronnie Sahlberg
e1b0cea427 add control and logging of very high latencies.
log the type of operation and the database name for all latencies higher
than a threshold

(This used to be ctdb commit 1d581dcd507e8e13d7ae085ff4d6a9f3e2aaeba5)
2008-10-30 12:49:53 +11:00
Ronnie Sahlberg
6474f3278d additional monitoring between the two daemons.
we currently only monitor that the daemons are running by kill(pid, 0)
and verifying that the domain socket between them is ok.

this is not sufficient since we can have a situation where the recovery
daemon is hung.

this new code monitors that the recovery daemon is operating.
if the recovery hangs, we log this and shut down the main daemon

(This used to be ctdb commit cd69d292292eaab3aac0e9d9fc57cb621597c63c)
2008-09-09 13:44:46 +10:00
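
The commit message above does not say how the main daemon checks that the recovery daemon is still operating. As a rough illustration only, one common approach is a liveness timer that is refreshed whenever the monitored process answers a periodic ping. The sketch below assumes that scheme; `RecoveryWatchdog`, its methods, and the timeout value are hypothetical names, not ctdb's API.

```python
class RecoveryWatchdog:
    """Tracks the last time the monitored daemon answered a ping (hypothetical)."""

    def __init__(self, timeout: float, now: float) -> None:
        self.timeout = timeout      # seconds of silence tolerated (assumed value)
        self.last_reply = now       # time of the most recent reply

    def record_reply(self, now: float) -> None:
        # Called whenever the monitored daemon answers a ping.
        self.last_reply = now

    def is_hung(self, now: float) -> bool:
        # The daemon counts as hung once it has been silent longer than the timeout.
        return now - self.last_reply > self.timeout
```

A caller would poll `is_hung()` from a timer and, as the commit describes, log the condition and shut down the main daemon when it returns true.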
Ronnie Sahlberg
ef997d344f initial ipv6 patch
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>

(This used to be ctdb commit 1f131f21386f428bbbbb29098d56c2f64596583b)
2008-08-19 14:58:29 +10:00
Ronnie Sahlberg
8b520bcb5f lower a debug message
(This used to be ctdb commit 554dcf16d37c8b9e4704df11d21fb272f30f5cec)
2008-07-18 10:38:51 +10:00
Ronnie Sahlberg
6eb4e46fe1 Add two new controls to start and cancel a persistent update.
This allows ctdb to automatically start a new full blown recovery
if a client has started updating the local tdb for a persistent database
but is kill -9ed before it has ensured the update is distributed clusterwide.

(This used to be ctdb commit 1ffccb3e0b3b5bd376c5302304029af393709518)
2008-07-17 13:50:55 +10:00
Ronnie Sahlberg
334db8ccba proper waitpid() fix.
remove all waitpid() calls and use the event system to trap sigchld

(This used to be ctdb commit 77458b2b6b51b2970c12b0e5b097088d3fb9d358)
2008-07-09 14:02:54 +10:00
Ronnie Sahlberg
522830dea8 Revert "waitpid() can block if it takes a long time before the child terminates"
This reverts commit bfba5c7249eff8a10a43b53c1b89dd44b625fd10.

revert the waitpid changes. we need to waitpid for some children so should
refactor the approach completely

(This used to be ctdb commit 702ced6c2fe569c01fe96c60d0f35a7e61506a96)
2008-07-08 17:41:31 +10:00
Ronnie Sahlberg
79425ddec5 Revert "set sigchild to SIG_IGN instead of SIG_DFL"
This reverts commit b1f1e80d3ad50280a300f2ed021513cf0a6f3a76.

(This used to be ctdb commit 2030e9ff2ca044181b72c3b87d513bf27057b5a2)
2008-07-08 17:40:53 +10:00
Ronnie Sahlberg
71d2315eee set sigchild to SIG_IGN instead of SIG_DFL
(This used to be ctdb commit b1f1e80d3ad50280a300f2ed021513cf0a6f3a76)
2008-07-08 16:31:23 +10:00
Ronnie Sahlberg
d67de4a7d2 waitpid() can block if it takes a long time before the child terminates
so we should not call it from the main daemon.

1, set SIGCHLD to SIG_DFL to make sure we ignore this signal

2, get rid of all waitpid() calls

3, change reporting of event script status code from _exit()/waitpid() to write()/read() one byte across the pipe.

(This used to be ctdb commit bfba5c7249eff8a10a43b53c1b89dd44b625fd10)
2008-07-08 03:48:11 +10:00
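
Step 3 of the commit above, reporting a child's status as one byte over a pipe rather than through `_exit()`/`waitpid()`, can be sketched roughly as follows. This is a minimal Python illustration of the technique, not ctdb's C code; `run_and_report` is a hypothetical name, and the sketch still reaps the child at the end for tidiness, which the commit itself set out to avoid.

```python
import os

def run_and_report(status: int) -> int:
    """Fork a child that reports `status` (0-255) as one byte over a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: close the unused read end, write the status byte, and exit.
        os.close(read_fd)
        os.write(write_fd, bytes([status]))
        os._exit(0)
    # Parent: close the write end so read() sees EOF if the child dies early.
    os.close(write_fd)
    data = os.read(read_fd, 1)
    os.close(read_fd)
    # Reap the child so this sketch leaves no zombie behind (assumption:
    # the real daemon handled child cleanup differently).
    os.waitpid(pid, 0)
    # An empty read means the child exited without reporting anything.
    return data[0] if data else -1
```

In an event-driven daemon the read end would normally be registered with the event loop instead of being read synchronously as it is here.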
Ronnie Sahlberg
adf40341a7 ctdb->methods becomes NULL when we shutdown the transport.
If we shutdown the transport and CTDB later decides to send a command out
for queueing, the call to ctdb->methods->allocate_pkt() will SEGV.

This could trigger for example when we are in the process of shutting down CTDBD and have already shutdown the transport but we are still waiting for the
"shutdown" eventscripts to finish.
If the event scripts now take much much longer to execute for some reason, this
race condition becomes much more probable.

Decorate all dereferencing of ctdb->methods-> with a check that ctdb->methods is non-NULL

(This used to be ctdb commit c4c2c53918da6fb566d6e9cbd6b02e61ae2921e7)
2008-05-11 14:28:33 +10:00
Ronnie Sahlberg
cd1858d126 fix compiler warning during a fatal error failing to lock down the socket
(This used to be ctdb commit 0ad22de1a614dc2d1926546027be5f5eea3381ed)
2008-04-10 09:56:49 +10:00
Ronnie Sahlberg
2da3fe1b17 From Chris Cowan
secure the domain socket and set permissions properly

(This used to be ctdb commit ac6a362fc2fc4a56b4c310478a96eb12daace176)
2008-04-10 06:51:53 +10:00
Ronnie Sahlberg
6b797f148c From Chris Cowan
Add support in AIX to track the PID of a client that connects to the unix domain socket

(This used to be ctdb commit 4c006c675d577d4a45f4db2929af6d50bc28dd9e)
2008-04-03 10:58:51 +11:00
Ronnie Sahlberg
03d30f405d decorate the memdump output with a nice field for ctdb_client structures to show the pid of the client that attached
(This used to be ctdb commit 0d9314302d0b988b6ab5d533deef40c5b343c249)
2008-04-01 17:17:21 +11:00
Ronnie Sahlberg
27a7f854f5 add improvements to tracking memory usage in ctdbd and the recovery daemon
and a ctdb command to pull the talloc memory map from a recovery daemon
ctdb rddumpmemory

(This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05)
2008-04-01 15:34:54 +11:00
Andrew Tridgell
f6e53f433b merge from ronnie
(This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c)
2008-02-04 20:07:15 +11:00
Andrew Tridgell
9d6ac0cf55 added debug constants to allow for better mapping to syslog levels
(This used to be ctdb commit 7ba8f1dde318eab03f4257e5a89fd23e7281e502)
2008-02-04 17:44:24 +11:00
Andrew Tridgell
b62b7fcde8 added syslog support, and use a pipe to catch logging from child processes to the ctdbd logging functions
(This used to be ctdb commit 1306b04cd01e996fd1aa1159a9521f2ff7b06165)
2008-01-16 22:03:01 +11:00
Andrew Tridgell
bf9e33d4cf - catch a case where the client disconnects during a call
- track all talloc memory, using NULL context

(This used to be ctdb commit bf89c56002f5311520e91cb367753bc46e5dddc9)
2008-01-16 09:44:48 +11:00
Ronnie Sahlberg
ba31feaec0 split node health monitoring and checking for connected/disconnected
nodes into two separate files.

move the monitoring of keepalives for detecting connected/disconnected 
remote nodes into ctdb_keepalive.c

(This used to be ctdb commit 23a57b20c314d5f11a433cf251eb9d9de743849a)
2008-01-15 08:42:12 +11:00
Andrew Tridgell
9311f7fb7e fixed the bug that made "onnode N service ctdb start" hang
(This used to be ctdb commit b50dcb16f30a60abce42f491f9b0aae7948b8206)
2008-01-05 12:09:29 +11:00
Andrew Tridgell
bde886988b prevent a deadly embrace between smbd and ctdbd by moving the calling
of the startup event scripts after the point where recovery has
started and the node is in normal operation

This makes the 'startup' script just a special type of the 'monitor'
script which is called first

(This used to be ctdb commit 7424c30a5fd04aea0137c466b4318c3f185280d8)
2007-11-12 10:53:11 +11:00
Andrew Tridgell
b87ddd9148 no longer wait at startup for services to become available, instead
set the node initially unhealthy and let the status monitoring bring the node online.
This fixes a problem with winbindd, where it refused to start because secrets.tdb was not populated
but we could not populate ctdbd, because the net command would not run while ctdbd was still doing startup
and thus frozen
(This used to be ctdb commit 3a001b793dd76fb96addf1e2ccb74da326fbcfbc)
2007-09-24 10:00:14 +10:00
Andrew Tridgell
c60988325d added support for persistent databases in ctdbd
(This used to be ctdb commit 3115090a0d882beca9d70761130b74bb0821f201)
2007-09-21 12:24:02 +10:00
Andrew Tridgell
a478c78f03 changed some debug levels
(This used to be ctdb commit ed764533e1c2f8982e1577ca5e7f5f4482a15345)
2007-09-12 13:21:19 +10:00
Ronnie Sahlberg
d66d9cdd22 change debug output from vnn to pnn
change ctdb_daemon_send_message to take pnn as parameter instead of vnn

(This used to be ctdb commit e352a2bbf9bb9a0b2c4f8329e8a529cf02414097)
2007-09-04 10:45:41 +10:00
Ronnie Sahlberg
211b497818 change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn
change ctdb_ban_info.vnn to ctdb_ban_info.pnn

(This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a)
2007-09-04 10:33:10 +10:00
Ronnie Sahlberg
583b6e6ba6 change ctdb_get_vnn to ctdb_get_pnn
(This used to be ctdb commit 1e19930198c2bcc7ccb755e0ee51555fb823029a)
2007-09-04 10:18:44 +10:00
Ronnie Sahlberg
fc9d39c3a6 change ctdb_validate_vnn to ctdb_validate_pnn
(This used to be ctdb commit a4a1f41b69475b9dc16d8fd7f8965c32e96c32f0)
2007-09-04 10:09:58 +10:00
Ronnie Sahlberg
eb4cf6a686 change ctdb->vnn to ctdb->pnn
(This used to be ctdb commit 8c776e5707e503ec6586aae39ac6b3ea5a2fd2bc)
2007-09-04 10:06:36 +10:00
Ronnie Sahlberg
8b06fc7284 change the structure used for node flag change messages so that we can
see both the old flags as well as the new flags (so we can tell which 
flags changed)

send the CTDB_SRVID_RECONFIGURE messages to connected nodes only, not to 
every node, connected or not, in the cluster.


in the handler inside the recovery daemon which is invoked for node flag 
change messages, only do a takeover_run() and redistribute the ip addresses IF it was the 
disabled or the unhealthy flags that changed. Also send out the cluster 
reconfigured message in this case.
If any of the other flags changed we don't need to do the takeover_run()
here since that will be done during recovery.

(This used to be ctdb commit 5549b2058e2c148a8ca9d419123acf3247bb8829)
2007-08-21 17:25:15 +10:00
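
Carrying both the old and the new flags in the message lets the handler work out which bits changed and act only on the relevant ones. A minimal sketch of that check follows; the flag bit values, the `NodeFlagsChange` class and the `needs_takeover_run` helper are all hypothetical, chosen for illustration rather than taken from ctdb's headers.

```python
from dataclasses import dataclass

# Hypothetical flag bits for illustration; the real values live in ctdb's headers.
FLAG_DISCONNECTED = 0x1
FLAG_UNHEALTHY = 0x2
FLAG_DISABLED = 0x4
FLAG_BANNED = 0x8

@dataclass
class NodeFlagsChange:
    node: int        # node the change applies to
    old_flags: int   # flags before the change
    new_flags: int   # flags after the change

def needs_takeover_run(change: NodeFlagsChange) -> bool:
    # XOR leaves a 1 in every bit position that differs between old and new.
    changed = change.old_flags ^ change.new_flags
    # Only the disabled or unhealthy bits trigger an IP redistribution here;
    # other flag changes are left to recovery, as the commit describes.
    return bool(changed & (FLAG_UNHEALTHY | FLAG_DISABLED))
```

With only the new flags available, the handler could not distinguish a node that just became unhealthy from one that already was.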
Ronnie Sahlberg
5228abef64 add an atexit() that will print "CTDB daemon shutting down" in the log
when the main daemon exits

(This used to be ctdb commit f7422397be2e319bfbee5bf0670583c353eda86d)
2007-08-21 09:43:53 +10:00
Ronnie Sahlberg
aed2c58c64 don't pollute the log with 'Registered PID XXX for client YYY' at log
level 0.

change the log level to 3 for this information message

(This used to be ctdb commit f28d713d9cacd2312932b51175aa8402c96ef76b)
2007-08-21 08:42:42 +10:00
Ronnie Sahlberg
fca90ce3c3 updated ctdb tickle management
there is an array for each node/public address that contains tcp tickles

we send a TCP_ADD as a broadcast to all nodes when a client is added

if tcp tickles are removed, they are only removed immediately from the 
local node.
once every 20 seconds a node will push/broadcast out the tickle list for 
all public addresses it manages. this will remove any deleted tickles
from the remote nodes

(This used to be ctdb commit e3c432a915222e1392d91835bc7a73a96ab61ac9)
2007-07-20 15:05:55 +10:00
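
The tickle scheme in the commit above (additions broadcast immediately, removals applied locally and propagated by a periodic push) can be sketched as below. This is an in-process simulation under stated assumptions: the `Node` class and its method names are hypothetical, a set stands in for the per-address array the commit mentions, and direct method calls stand in for network messages.

```python
from collections import defaultdict

class Node:
    """Simulated cluster node holding tickle lists per public address."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.peers = []                   # other nodes in the cluster
        self.owned = set()                # public addresses this node manages
        self.tickles = defaultdict(set)   # public address -> client connections

    def add_tickle(self, addr: str, conn: str) -> None:
        # Additions are broadcast to every node straight away.
        for node in [self] + self.peers:
            node.tickles[addr].add(conn)

    def remove_tickle(self, addr: str, conn: str) -> None:
        # Removals only take effect on the local node at first.
        self.tickles[addr].discard(conn)

    def push_tickle_lists(self) -> None:
        # Periodic push: peers replace their copy of each owned address's
        # list, which drops entries that were removed locally.
        for addr in self.owned:
            for peer in self.peers:
                peer.tickles[addr] = set(self.tickles[addr])
```

Between pushes a remote node may still hold a stale entry; the commit's 20 second interval bounds how long that can last.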
Andrew Tridgell
d2a5af7eb8 fully save/restore scheduler parameters
(This used to be ctdb commit 59408eabe7515d49a6eef3b6fb2590a1cd1df956)
2007-07-13 09:35:46 +10:00
Andrew Tridgell
fc73bc5c24 added --nosetsched option to ctdbd
(This used to be ctdb commit 4cbbb88c1735c7d112e751e22da1c1c69e09bf4a)
2007-07-13 08:47:02 +10:00
Andrew Tridgell
32de198fd3 update lib/replace from samba4
(This used to be ctdb commit f0555484105668c01c21f56322992e752e831109)
2007-07-10 15:29:31 +10:00
Andrew Tridgell
6399cf9542 added code to kill registered clients on an IP release
(This used to be ctdb commit ca0243b544987ce0618a99ac87b4abf598991e93)
2007-06-19 03:54:06 +10:00
Andrew Tridgell
97d5bea2eb on startup release all IPs, in case we have any left over from a previous run
(This used to be ctdb commit 5eb2f8f5f70f567c264d6929e95899b70f0e4ec0)
2007-06-12 19:44:54 +10:00
Andrew Tridgell
044a2e04c4 - send tcp info to all connected nodes, not just vnnmap nodes
- use a non-blocking freeze when banned
- release all IPs when banned

(This used to be ctdb commit 070e85e532b33b792f85c3e72eee205d906aaf85)
2007-06-10 08:46:33 +10:00
Andrew Tridgell
ae3d54094b start splitting the code into separate client and server pieces
(This used to be ctdb commit 603cd77988c181525946cd5eb0f4d0d646b58059)
2007-06-07 22:06:19 +10:00