mirror of https://github.com/samba-team/samba.git synced 2025-01-25 06:04:04 +03:00

1527 Commits

Author SHA1 Message Date
Ronnie Sahlberg
52f03be2d9 From Chris Cowan, patch to make AIX compile again
(This used to be ctdb commit 77255bb5523b8d132770a0a7d4ba29ec9e5043cc)
2008-07-09 10:17:39 +10:00
Ronnie Sahlberg
2d644b3fbe Replace \s with [[:space:]] in the regexps we use for egrep.
Kevin Collins noticed that RHEL5 grep-2.5.1-54.2.el5 built for
x86 does not handle \s, while the exact same RHEL5 package for amd64
does!

[[:space:]] is more portable, even across the same package version (different architecture) from the same vendor :-)

(This used to be ctdb commit fd7bb21c4f9289fc34a57f9d8cb7c13a02d06096)
2008-07-09 10:03:21 +10:00
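The portability point above can be checked from C as well. This is a minimal sketch, assuming a POSIX system with `<regex.h>`; `ere_matches` is a hypothetical helper and not part of ctdb, whose scripts call egrep directly. POSIX extended regular expressions define the `[[:space:]]` character class, whereas `\s` is an extension that not every build provides.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not from ctdb): returns true if `text` matches the
 * POSIX extended regular expression `pattern`, false otherwise or if the
 * pattern does not compile. */
bool ere_matches(const char *pattern, const char *text)
{
    regex_t re;
    bool matched;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
        return false;
    }
    matched = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

A call such as `ere_matches("^tcp[[:space:]]+0", line)` relies only on the standard character class and so should behave the same across conforming regex implementations.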
Ronnie Sahlberg
522830dea8 Revert "waitpid() can block if it takes a long time before the child terminates"
This reverts commit bfba5c7249eff8a10a43b53c1b89dd44b625fd10.

Revert the waitpid changes. We need to waitpid for some children, so we should
refactor the approach completely.

(This used to be ctdb commit 702ced6c2fe569c01fe96c60d0f35a7e61506a96)
2008-07-08 17:41:31 +10:00
Ronnie Sahlberg
79425ddec5 Revert "set sigchild to SIG_IGN instead of SIG_DFL"
This reverts commit b1f1e80d3ad50280a300f2ed021513cf0a6f3a76.

(This used to be ctdb commit 2030e9ff2ca044181b72c3b87d513bf27057b5a2)
2008-07-08 17:40:53 +10:00
Ronnie Sahlberg
71d2315eee set sigchild to SIG_IGN instead of SIG_DFL
(This used to be ctdb commit b1f1e80d3ad50280a300f2ed021513cf0a6f3a76)
2008-07-08 16:31:23 +10:00
Ronnie Sahlberg
f1c4041c84 new version 1.0.45
(This used to be ctdb commit b4b2408ba1bdce22abb3fb19d398b72e96da6505)
2008-07-08 10:03:57 +10:00
Ronnie Sahlberg
5ab7eaa553 update the monitor event for nfs to track how many times in a row it has failed
to "ping" the local nfs daemon.

Once it has failed more than 3 times in a row it will attempt to restart the nfs service.

(This used to be ctdb commit a4e89f57a8d733ea74df7b0de31eb977d6d37388)
2008-07-08 09:58:10 +10:00
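The counting rule described above can be sketched in C as follows. The actual change lives in an event script, so this is only an illustration: the struct, the function name, and the decision to reset the counter after a restart attempt are assumptions, and only the "more than 3 times in a row" threshold comes from the commit message.

```c
#include <stdbool.h>

/* Threshold from the commit text: restart once the check has failed
 * more than 3 times in a row. */
#define NFS_MAX_CONSECUTIVE_FAILURES 3

/* Hypothetical state kept between monitor runs. */
struct nfs_monitor_state {
    unsigned int consecutive_failures;
};

/* Record one monitor result. Returns true when the caller should attempt
 * to restart the service. */
bool nfs_monitor_update(struct nfs_monitor_state *st, bool ping_ok)
{
    if (ping_ok) {
        st->consecutive_failures = 0;   /* any success breaks the streak */
        return false;
    }
    st->consecutive_failures++;
    if (st->consecutive_failures > NFS_MAX_CONSECUTIVE_FAILURES) {
        /* Assumption: start counting afresh after a restart attempt. */
        st->consecutive_failures = 0;
        return true;
    }
    return false;
}
```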
Ronnie Sahlberg
d67de4a7d2 waitpid() can block if it takes a long time before the child terminates
so we should not call it from the main daemon.

1, set SIGCHLD to SIG_DFL to make sure we ignore this signal

2, get rid of all waitpid() calls

3, change reporting of event script status code from _exit()/waitpid() to write()/read() one byte across the pipe.

(This used to be ctdb commit bfba5c7249eff8a10a43b53c1b89dd44b625fd10)
2008-07-08 03:48:11 +10:00
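Step 3 above, reporting a status as one byte over a pipe instead of through the exit code, might look roughly like the sketch below. The names `run_child_report_status` and `example_script` are hypothetical and the code is not taken from ctdb. The parent blocks in read() here for brevity, whereas a daemon would more likely watch the descriptor from its event loop, and the sketch does nothing about reaping the child.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for an event script; returns its status code. */
int example_script(void)
{
    return 3;
}

/* Run `fn` in a child process and return the one-byte status the child
 * wrote to the pipe, or -1 if no status could be obtained. */
int run_child_report_status(int (*fn)(void))
{
    int fds[2];
    pid_t pid;
    unsigned char status;
    ssize_t n;

    if (pipe(fds) != 0) {
        return -1;
    }
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: do the work, then send the result as a single byte. */
        close(fds[0]);
        status = (unsigned char)fn();
        n = write(fds[1], &status, 1);
        _exit(n == 1 ? 0 : 1);
    }
    /* Parent: only the read end is needed. A return of 0 from read()
     * means the child went away without reporting anything. */
    close(fds[1]);
    n = read(fds[0], &status, 1);
    close(fds[0]);
    return (n == 1) ? (int)status : -1;
}
```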
Ronnie Sahlberg
6bfbec28a4 use more liberal handling of event scripts timing out.
If the event script that timed out was for the "monitor" event, then
even if it timed out we still return SUCCESS back to the caller that invoked the eventscript.
Only consider the eventscript for "monitor" to have failed with an error
if it actually terminated with an error, or if it timed out 5 times in a row and hung.

(This used to be ctdb commit 60f3c04bd8b20ecbe937ffed08875cdc6898b422)
2008-07-07 20:38:59 +10:00
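The policy in this message separates a genuine script error from a timeout. A minimal sketch of that decision, with hypothetical type and function names and with the limit of 5 taken from the message, could read as follows. Clearing the timeout streak on an error is an assumption.

```c
/* Hypothetical outcome of one run of the "monitor" event script. */
enum script_outcome { SCRIPT_OK, SCRIPT_ERROR, SCRIPT_TIMEOUT };

/* From the commit text: 5 timeouts in a row count as a failure. */
#define MONITOR_MAX_TIMEOUTS 5

struct monitor_state {
    unsigned int timeouts_in_a_row;
};

/* Returns 0 if the monitor event is reported as successful, -1 if it is
 * reported as failed. */
int monitor_result(struct monitor_state *st, enum script_outcome outcome)
{
    switch (outcome) {
    case SCRIPT_OK:
        st->timeouts_in_a_row = 0;
        return 0;
    case SCRIPT_ERROR:
        /* A real error always fails; assumption: it also ends the streak. */
        st->timeouts_in_a_row = 0;
        return -1;
    case SCRIPT_TIMEOUT:
        st->timeouts_in_a_row++;
        if (st->timeouts_in_a_row >= MONITOR_MAX_TIMEOUTS) {
            return -1;   /* hung too many times in a row */
        }
        return 0;        /* tolerate an isolated timeout */
    }
    return -1;
}
```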
Ronnie Sahlberg
6eff9289d7 new version .44
(This used to be ctdb commit 6043f926f89b361c7fe14fc60d2769fd2ba63dfc)
2008-07-07 09:07:49 +10:00
Ronnie Sahlberg
811493a0b6 zero out the sockaddr_in structure before we store the ipv4 data in it to make sure that all data is initialized. Otherwise valgrind will complain about uninitialized data when we write this structure out on the wire.
(This used to be ctdb commit 80e249512f93bca2445d40590db38d31be2aafd7)
2008-07-07 08:53:22 +10:00
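The pattern described here is the usual one of clearing the whole structure before setting individual fields, so that padding and unused members do not carry stack garbage onto the wire. A minimal sketch follows; `fill_ipv4_sockaddr` is a hypothetical name, not a ctdb function.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill `sin` with an IPv4 address and port.
 * Returns 0 on success, -1 if `addr` is not a valid dotted-quad string. */
int fill_ipv4_sockaddr(struct sockaddr_in *sin, const char *addr, uint16_t port)
{
    /* Zero everything first, including padding and unused members. */
    memset(sin, 0, sizeof(*sin));

    sin->sin_family = AF_INET;
    sin->sin_port = htons(port);
    if (inet_pton(AF_INET, addr, &sin->sin_addr) != 1) {
        return -1;
    }
    return 0;
}
```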
Ronnie Sahlberg
2003196816 we need a 'case x:' in our ugly 'encode the control opcode as a linenumber in valgrind output' hack to make it work
(This used to be ctdb commit f4929e164be1703f74fc332e740b85cfe1ae3e73)
2008-07-07 08:52:04 +10:00
Ronnie Sahlberg
64e02585e7 If a transaction commit fails, log this error and cancel all pending transactions to the
databases instead of calling ctdb_fatal()

(This used to be ctdb commit ff2985aaef999d180277db4cf644fee0ea79c14d)
2008-07-07 08:51:05 +10:00
Ronnie Sahlberg
f25fd04f73 in the destructor for the lock-wait child, make sure that we cancel any pending
transactions.

(This used to be ctdb commit 45b6ff64f6ddf037b810c4e5f8b9f04d71067b98)
2008-07-07 08:50:12 +10:00
Andrew Tridgell
9999f18369 an extraordinarily ugly patch!
This is a hack to allow backtraces under valgrind to show what opcode
is getting uninitialised bytes

(This used to be ctdb commit 67bb12c8f0af5914efb44b76bc6ddbb11fc0fcdf)
2008-07-04 18:00:24 +10:00
Andrew Tridgell
30f8411eb9 ensure pad bytes in the ltdb_header are initialised
(This used to be ctdb commit 00b1a635e3d61ca7c5487d65ac54f3eb6ea7355e)
2008-07-04 17:40:25 +10:00
Andrew Tridgell
50cd520c6a don't use mmap in tdb if --nosetsched is set. That makes valgrind
happier (it doesn't like the mmap/msync calls in tdb)

(This used to be ctdb commit f3a729998ce67f5d2e3b2ad41d96e8f04c0d18d8)
2008-07-04 17:32:21 +10:00
Andrew Tridgell
e7ac67ccc6 prevent valgrind errors where we print uninitialised values on control errors
(This used to be ctdb commit ababd8aba2f9c13aaa1b623b8a76c2f98bb94dd4)
2008-07-04 17:15:06 +10:00
Andrew Tridgell
b3bcb42774 fixed a warning
(This used to be ctdb commit 015cd221c3c62eaa3cd0351fb8e93292c7c293aa)
2008-07-04 17:04:37 +10:00
Andrew Tridgell
60e5d83cb0 fixed some incorrect CTDB_NO_MEMORY*() calls found after fixing the
_VOID variant

(This used to be ctdb commit 07c9133aedecaee3607ad3b6fa94e5c56417a9de)
2008-07-04 17:04:26 +10:00
Andrew Tridgell
8be67e0e09 CTDB_NO_MEMORY_VOID() needs to return on error
(This used to be ctdb commit 6d21fd57bedffce2298ce7fe4c7d889c858ba7fa)
2008-07-04 16:58:29 +10:00
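The bug class named here is a check macro that logs an allocation failure but lets execution continue. The sketch below shows the general shape of a macro that does return, for use in functions returning void. `NO_MEMORY_VOID` and `use_buffer` are hypothetical stand-ins and do not reproduce the real CTDB_NO_MEMORY_VOID() definition.

```c
#include <stdio.h>

/* Hypothetical stand-in for an out-of-memory check in a void function:
 * log the failure and return from the enclosing function. */
#define NO_MEMORY_VOID(p) do { \
    if ((p) == NULL) { \
        fprintf(stderr, "Out of memory for %s at %s:%d\n", \
                #p, __FILE__, __LINE__); \
        return; \
    } \
} while (0)

/* Example caller: sets *done to 1 only if the pointer check passed. */
void use_buffer(char *buf, int *done)
{
    *done = 0;
    NO_MEMORY_VOID(buf);   /* without the `return`, the next line would crash */
    buf[0] = '\0';
    *done = 1;
}
```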
Andrew Tridgell
75b8cd1096 added option to start ctdb under valgrind
Just add CTDB_VALGRIND=yes in /etc/sysconfig/ctdb, and look at the
logs in /var/log/ctdb_valgrind.*

(This used to be ctdb commit 9acd577c97059e8924582ac52e9ce5785903f120)
2008-07-04 16:58:14 +10:00
Andrew Tridgell
07e145316c zero out the ctdb->freeze_handle when we free it
This prevents heap corruption when a freeze child dies

(This used to be ctdb commit 4edc6d40cb63936146af99030b7819683238abfc)
2008-07-04 16:05:04 +10:00
Ronnie Sahlberg
64c4639ce9 we don't need to explicitly thaw the databases from the recovery daemon
since this is already done implicitly when we change the recovery mode
back to normal

(This used to be ctdb commit af1f6cf7561fe9cb5c97f940d4458c83bdd8e2a0)
2008-07-03 12:46:09 +10:00
Ronnie Sahlberg
ef769e7237 track both when we last started and ended a recovery.
make ctdb uptime print how long the recovery took

in the recovery daemon, when we check that the public ip address
allocation on the local node is correct (we have the ips we should have
and we don't have any we shouldn't have), use ctdb uptime to check the
recovery start/stop times and make sure we don't check for ip allocation
inconsistencies during a recovery, when the ip address allocation is in flux.

(This used to be ctdb commit f86551580349b7f662f9a07e4eb0c1189e38e429)
2008-07-02 13:55:59 +10:00
Ronnie Sahlberg
05b50ebe0a print the opcode when an async callback detects an error
(This used to be ctdb commit 423934629704683d3a3042570577fb4e04b17a6d)
2008-07-02 12:21:53 +10:00
Ronnie Sahlberg
bb2019bb0f update a comment to reflect that this is not always a real recovery
it can also be printed when we just do an ip reallocation

(This used to be ctdb commit e4c9e511fc5e15e0638ebb9117cb4a65ca8fda4b)
2008-07-02 12:01:19 +10:00
Ronnie Sahlberg
e75c7907fe new version
(This used to be ctdb commit af5d9435822917d36929e667063db69e6a426d3d)
2008-07-01 09:34:43 +10:00
Ronnie Sahlberg
ff4af5ff9f init.d/ctdb is not a config file
(This used to be ctdb commit 9f501cee9132114e7467a33dab5cfe0737f94f44)
2008-06-27 09:31:18 +10:00
Ronnie Sahlberg
03cbb27a79 make /etc/ctdb/functions executable and add a hashbang to it so
rpmlint won't complain

(This used to be ctdb commit 9b8179ad043a80e0e18eeba427a7b7b15690d039)
2008-06-27 09:29:38 +10:00
Ronnie Sahlberg
1ccc4a8e2b test
(This used to be ctdb commit 4f2d722cf29175c3c207e6ebb6d4f9e370767249)
2008-06-26 14:14:37 +10:00
Ronnie Sahlberg
f1b3ddc357 Revert "test"
This reverts commit f71287a28d66db202fe52f9a43b6daf2389d7f66.

(This used to be ctdb commit a928857e38d645baca62cea7f7367488d140dca7)
2008-06-26 14:00:36 +10:00
Ronnie Sahlberg
2cffc2e9c6 test
(This used to be ctdb commit f71287a28d66db202fe52f9a43b6daf2389d7f66)
2008-06-26 13:51:18 +10:00
Ronnie Sahlberg
c5de452dca reduce loglevel of the info message that we are updating the flags on all nodes
(This used to be ctdb commit 9a98a21979558dcd6421b3fcb97d21ab82b792d8)
2008-06-26 13:15:41 +10:00
Ronnie Sahlberg
c5e7e0b2fd force an update of the flags from the recmaster after each monitoring run
(This used to be ctdb commit 251aeadc8b16a9c27a4bae78c97ad6e93e6cfdf4)
2008-06-26 13:08:37 +10:00
Ronnie Sahlberg
854a615de9 /etc/ctdb/functions should not be executable
(This used to be ctdb commit d481f0f3d11e66d259cbc84f34cb6ae27d09e42c)
2008-06-26 12:43:30 +10:00
Ronnie Sahlberg
cfc0af79ce third attempt for fixing a freeze child writing to the socket
(This used to be ctdb commit b8c8c5cb351747863c5d1366b57c96122ade5db0)
2008-06-26 11:52:26 +10:00
Ronnie Sahlberg
97f8bf16c5 verify that the recmaster has the correct flags for us and if not tell the recmaster what the flags should be
(This used to be ctdb commit 3387597926ad71e4140cc504b828486d99a3ec8e)
2008-06-26 11:08:09 +10:00
Ronnie Sahlberg
2910ea1606 only loop over the write if the write failed
(This used to be ctdb commit b99d687894cb69d863345713055d9c8dc1b29194)
2008-06-26 11:02:08 +10:00
Ronnie Sahlberg
77ef05e95b the write() from the freeze child process can fail
try writing many times and log an error if the write failed

(This used to be ctdb commit f15b224e42e81cda84b98f01f919d463e80fb89f)
2008-06-26 09:54:27 +10:00
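A bounded retry around write(), logging only once all attempts have failed, might be sketched as below. The attempt limit and the name `write_status_byte` are assumptions; the commit message only says the write is tried many times.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Assumption: the commit does not state how many attempts are made. */
#define WRITE_MAX_ATTEMPTS 10

/* Hypothetical helper: write one status byte to `fd`, retrying only when
 * the write fails. Returns 0 on success, -1 after logging an error. */
int write_status_byte(int fd, unsigned char status)
{
    int attempt;

    for (attempt = 0; attempt < WRITE_MAX_ATTEMPTS; attempt++) {
        ssize_t n = write(fd, &status, 1);
        if (n == 1) {
            return 0;   /* success: stop looping */
        }
    }
    fprintf(stderr, "failed to write status byte to fd %d: %s\n",
            fd, strerror(errno));
    return -1;
}
```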
Ronnie Sahlberg
aa82d15e75 it is 2008 not 2008 right now :-)
(This used to be ctdb commit 8734bd32809ad817ad28d96315a139674429c395)
2008-06-13 13:53:05 +10:00
Ronnie Sahlberg
79caa61de8 update to 1.0.42
(This used to be ctdb commit de8f1bedc56da05c03cfd0e4780839771d94a58f)
2008-06-13 13:50:28 +10:00
Ronnie Sahlberg
fd921aea28 ban the node after 3 failed scripts by default
(This used to be ctdb commit b4e6d8e37c7f985f357af82b4a524959bb97ec4c)
2008-06-13 13:45:23 +10:00
Ronnie Sahlberg
779468ab3f if the event scripts hang EventScriptsBanCount times in a row
the node will ban itself for the default recovery ban period

(This used to be ctdb commit 7239d7ecd54037b11eddf47328a3129d281e7d4a)
2008-06-13 13:18:06 +10:00
Ronnie Sahlberg
30535c815d when an eventscript has timed out, log the event options (i.e. "monitor" "takeip 1.2..." etc)
to the log

(This used to be ctdb commit dbe31581abf35fc4a32d3cbf487dd34e2b9c937a)
2008-06-13 12:18:00 +10:00
Ronnie Sahlberg
e6d1d766c5 make it possible to re-start a recovery without marking the current node as
the culprit.

(This used to be ctdb commit 3a69fad0b1dee4a482461680c556358409e53c4d)
2008-06-13 11:47:42 +10:00
Ronnie Sahlberg
4b6b094860 add a callback for failed nodes to the async control helper.
this callback is called for every node where the control failed (or timed out)

when we issue the start recovery control from the recovery master,
set any node that fails as a culprit so it will eventually be banned

(This used to be ctdb commit 72f89bac13cbe8c3ca3e7a942469cd2ff25abba2)
2008-06-12 16:53:36 +10:00
Ronnie Sahlberg
d8433cacb2 first cut to convert takeover_callback_state{}
to use ctdb_sock_addr instead of sockaddr_in

(This used to be ctdb commit 5444ebd0815e335a75ef4857546e23f490a22338)
2008-06-04 17:12:57 +10:00
Ronnie Sahlberg
598fba7fad fix a comment
note that we don't actually send the ipv6 "gratuitous arp" on the wire just yet
(since ipv6 doesn't use arp),
but all the infrastructure is there for when we implement sending raw neighbor discovery packets

(This used to be ctdb commit b87fab857bc9b3537527be93b7f68484502d6b84)
2008-06-04 15:23:06 +10:00
Ronnie Sahlberg
7d39ac131b convert handling of gratuitous arps and their controls and helpers to
use the ctdb_sock_addr structure so they work for both ipv4 and ipv6

(This used to be ctdb commit 86d6f53512d358ff68b58dac737ffa7576c3cce6)
2008-06-04 15:13:00 +10:00