mirror of https://github.com/samba-team/samba.git synced 2024-12-24 21:34:56 +03:00
Commit Graph

1397 Commits

Author SHA1 Message Date
Andrew Tridgell
43aa27c9ee this is needed with merged tdb
(This used to be ctdb commit 3dc07f2bf98ab445ab960ef14173bc6924e3b658)
2008-01-05 17:42:01 +11:00
Andrew Tridgell
841a04924c merge from Samba4
(This used to be ctdb commit 9aed7a1d065272c2e5b54872228a73f37664b526)
2008-01-05 17:41:41 +11:00
Andrew Tridgell
370779a1bb update from Samba4
(This used to be ctdb commit 298118c41bd33acd1a34a35a71a28451a45390c5)
2008-01-05 17:41:01 +11:00
Andrew Tridgell
67d2b14d90 convert tdb from u32 to uint32_t to match the current Samba trees
(This used to be ctdb commit 0dc754b7e8b0985a252885ed043949dfb7ea1ae1)
2008-01-05 17:22:47 +11:00
Andrew Tridgell
c2f84d6f4e Rewrote the tdb transaction code to be O(N) instead of O(N^2)
The previous transaction code was fast as long as you didn't do too
many writes within the transaction. The new code is a bit slower for
very small numbers of writes, but scales linearly as the number of
writes increases. The old code scaled as O(N^2) with the number of
writes, making it unusable for large N.

After testing, this needs to be merged into the Samba version of tdb,
along with many of the other recent tdb changes in the ctdb tree.

(This used to be ctdb commit bef8fe3d3ba80c7c660972c5357407f5278f7e26)
2008-01-05 17:19:47 +11:00
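
The scaling argument in the message above can be illustrated with a small sketch. It is not the tdb transaction code: it only shows one way to make the cost of each write independent of the number of earlier writes, by keeping pending data in fixed-size blocks looked up by index rather than scanning a list of previous writes. Every name here (`sketch_txn`, `sketch_txn_write`, `sketch_txn_peek`, `sketch_txn_free`, `SKETCH_BLOCK_SIZE`) is hypothetical, and a real implementation would fill a new block from the underlying file instead of with zeros.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_BLOCK_SIZE 1024u   /* assumed block size, for illustration only */

struct sketch_txn {
    uint8_t **blocks;    /* blocks[i] is the pending copy of block i, or NULL */
    size_t num_blocks;   /* length of the blocks array */
};

/* Record a pending write. Each touched block is found by index, so the
 * lookup does not depend on how many writes came before.
 * Returns 0 on success, -1 on allocation failure. */
int sketch_txn_write(struct sketch_txn *t, size_t off, const void *buf, size_t len)
{
    const uint8_t *src = buf;

    assert(t != NULL && (buf != NULL || len == 0));
    while (len > 0) {
        size_t blk = off / SKETCH_BLOCK_SIZE;
        size_t in_blk = off % SKETCH_BLOCK_SIZE;
        size_t n = SKETCH_BLOCK_SIZE - in_blk;
        if (n > len) {
            n = len;
        }
        if (blk >= t->num_blocks) {
            /* grow the index and mark the new slots as empty */
            uint8_t **nb = realloc(t->blocks, (blk + 1) * sizeof(*nb));
            if (nb == NULL) {
                return -1;
            }
            for (size_t i = t->num_blocks; i <= blk; i++) {
                nb[i] = NULL;
            }
            t->blocks = nb;
            t->num_blocks = blk + 1;
        }
        if (t->blocks[blk] == NULL) {
            /* zero-filled here; a real store would read the file block */
            t->blocks[blk] = calloc(1, SKETCH_BLOCK_SIZE);
            if (t->blocks[blk] == NULL) {
                return -1;
            }
        }
        memcpy(t->blocks[blk] + in_blk, src, n);
        src += n;
        off += n;
        len -= n;
    }
    return 0;
}

/* Fetch one pending byte. Returns 1 if a pending block covers `off`, else 0. */
int sketch_txn_peek(const struct sketch_txn *t, size_t off, uint8_t *out)
{
    size_t blk = off / SKETCH_BLOCK_SIZE;

    assert(t != NULL && out != NULL);
    if (blk >= t->num_blocks || t->blocks[blk] == NULL) {
        return 0;
    }
    *out = t->blocks[blk][off % SKETCH_BLOCK_SIZE];
    return 1;
}

/* Drop all pending data. */
void sketch_txn_free(struct sketch_txn *t)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->num_blocks; i++) {
        free(t->blocks[i]);
    }
    free(t->blocks);
    t->blocks = NULL;
    t->num_blocks = 0;
}
```

The trade-off matches the one described in the message: a single small write pays for a whole block, but many writes no longer pay for each other.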
Andrew Tridgell
5180a3b756 fixed excludes in tar ball creation for src rpm
(This used to be ctdb commit fe0662fb2cdf733c5c9da7e24641e89039cb3e54)
2008-01-05 13:08:10 +11:00
Andrew Tridgell
a54b88dba2 fixed data offset definition
(This used to be ctdb commit cef83d74883f6c66866fb7e5e17769322a3473da)
2008-01-05 12:10:18 +11:00
Andrew Tridgell
9311f7fb7e fixed the bug that made "onnode N service ctdb start" hang
(This used to be ctdb commit b50dcb16f30a60abce42f491f9b0aae7948b8206)
2008-01-05 12:09:29 +11:00
Andrew Tridgell
63b2d1c34e cleanup the new freelist code
(This used to be ctdb commit 76137104c7028b061578950d4b6b35ca8267fab1)
2008-01-05 12:09:00 +11:00
Andrew Tridgell
a21afe88bc added tdb_wipe_all() function
(This used to be ctdb commit 8e2d81cf54630970d66af92de2c0333acd2e1d22)
2008-01-05 12:08:41 +11:00
Andrew Tridgell
9bd69e75c8 ensure we always build the right version
(This used to be ctdb commit 841943e74355a4347cda5b23b1807522bf12f169)
2008-01-05 09:55:18 +11:00
Andrew Tridgell
02287e5781 update version
(This used to be ctdb commit 37a1d17365995f15696e4c338d7c2efcc04c1e6e)
2008-01-05 09:52:53 +11:00
Andrew Tridgell
e4aefbc66d a new tunable DatabaseMaxDead that enables the tdb max dead cache logic
(This used to be ctdb commit 01c519c3658a8fcb9545b507b597e723658e4c4e)
2008-01-05 09:36:53 +11:00
Andrew Tridgell
023a230d9c a useful hack for checking correct behaviour of recovery
(This used to be ctdb commit d88b95a5407b53ead47ca0638ee60653ea3d3d07)
2008-01-05 09:36:21 +11:00
Andrew Tridgell
f79dfd04c0 convert much of the recovery logic to be async and parallel across all nodes
(This used to be ctdb commit 8b72a02bf1045d8befb342a4111ca1316889262e)
2008-01-05 09:35:43 +11:00
Andrew Tridgell
9a625534c1 this fixes the non-dmaster bug that has plagued us for months
(This used to be ctdb commit 2acf6c6201862debfca054a09262f75c066d2deb)
2008-01-05 09:34:47 +11:00
Andrew Tridgell
69fb0d3874 avoid write locks during delete checks in traversals
(This used to be ctdb commit dde9f3f0061988a0cdf10ee9e4db982c1b79ad1a)
2008-01-05 09:33:39 +11:00
Andrew Tridgell
fc21f78231 make some specific cases of the non-dmaster bug non-fatal
(This used to be ctdb commit 7b516ab06c7ba7ffe9ecf3f76720df5360176b2c)
2008-01-05 09:32:29 +11:00
Andrew Tridgell
c4826c203a added async pull, push and rsn handling functions
(This used to be ctdb commit 05d30180f64aaff13411b92586ac554d84a35d9a)
2008-01-05 09:31:43 +11:00
Andrew Tridgell
e9987cf236 fixed a warning
(This used to be ctdb commit f34d0f9351c1cda3327efb14e173f249f7854570)
2008-01-05 09:30:49 +11:00
Andrew Tridgell
9ea20f3916 expand tdb by minimum of 25% at a time
(This used to be ctdb commit 355575878e2b6e85268ca8387f41a19bcd9db651)
2008-01-05 09:30:09 +11:00
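
A minimal sketch of what "expand by a minimum of 25%" could look like as a sizing rule. The helper name `sketch_expand_size` and its signature are assumptions made for illustration; this is not the tdb expansion code, which may round or align sizes differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: choose the new size of a store that must grow by at
 * least `needed` bytes, never growing by less than 25% of its current size. */
size_t sketch_expand_size(size_t current, size_t needed)
{
    size_t min_growth = current / 4;   /* 25% of the current size */
    size_t growth = needed > min_growth ? needed : min_growth;

    assert(growth <= (size_t)-1 - current);   /* guard against overflow */
    return current + growth;
}
```

Growing geometrically like this keeps the number of expansions logarithmic in the final size, at the cost of some unused space.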
Andrew Tridgell
8a10bc6561 update revnumber for custom tree
(This used to be ctdb commit 8b36f4a4f131102769782fe2994b846fc2b8da13)
2008-01-04 12:42:29 +11:00
Andrew Tridgell
afc7275c16 fixed a warning
(This used to be ctdb commit d6255438d63943736b24a7a6da190b6933379a61)
2008-01-04 12:42:10 +11:00
Andrew Tridgell
b4a5c5e988 make sure vars are set at startup before recovery
(This used to be ctdb commit 2c789f19b069c975c133dd8488b566a6715a8e76)
2008-01-04 12:41:53 +11:00
Andrew Tridgell
ea13223fbb prevent O(n^2) behaviour for traverse after large numbers of deletes
(This used to be ctdb commit e3c60552366f1d8d464c43efbcd6ed5a2a1adb71)
2008-01-04 12:12:02 +11:00
Andrew Tridgell
2509821503 prevent a re-ban loop for single node clusters
(This used to be ctdb commit b20a3369655bcba274c99091157ba7466994e848)
2008-01-04 12:11:29 +11:00
Andrew Tridgell
10bc8a4c92 added ctdb_randrec test tool
(This used to be ctdb commit be59e7f3db992667664a631433b99ff19f4313f0)
2008-01-04 09:41:04 +11:00
Andrew Tridgell
41fb8e283b add randrec to Makefile
(This used to be ctdb commit ded1f7903e8a6525ab1888e8c4f50c71fa23cc19)
2008-01-04 09:19:06 +11:00
Andrew Tridgell
bb06e831a0 more optimisations to recovery
(This used to be ctdb commit 9a41ad0a842cd4f3792d6e84b5c809b7ff6f342e)
2008-01-02 22:44:46 +11:00
Andrew Tridgell
1ebc6307f1 make this a custom build
(This used to be ctdb commit 570ad64dadde2056e41d95d1278cf030949855d5)
2008-01-02 12:06:55 +11:00
Andrew Tridgell
77c376eed9 make this a custom build
(This used to be ctdb commit cc805ec72c9b0e60e06b5b920fbb5fe67b266c2a)
2008-01-02 12:06:19 +11:00
Andrew Tridgell
fa965dee8f quick fix for timeout in recovery
(This used to be ctdb commit 9205c681a819782d061bb41637191c130e91b100)
2008-01-02 12:04:07 +11:00
Andrew Tridgell
ec5995221f fixed order of changelog
(This used to be ctdb commit 05940cc8a7c7e75b976f2f0151d03fdf63c59395)
2007-12-27 10:19:09 +11:00
Andrew Tridgell
d116235197 updated release info
(This used to be ctdb commit 657aac41b2c2f7e4d53e5709d4eb8dbd9c5f5616)
2007-12-27 10:13:54 +11:00
Andrew Tridgell
2a2f1e3d91 fixed segv on failed ctdb_ctrl_getnodemap
(This used to be ctdb commit 5daf9a72f0e60a9af7cf32ae6d759be7d94857ec)
2007-12-27 10:07:01 +11:00
Andrew Tridgell
36fd19774a update release number and changelog
(This used to be ctdb commit fa9723e1e43cdbb5c0c3c31fd79c50aa1298ba3d)
2007-12-04 15:50:43 +11:00
Andrew Tridgell
6ef3bff4ed merge from ronnie
(This used to be ctdb commit 072ef744951d3aa59dd8be70578b99b18c37d988)
2007-12-04 15:20:40 +11:00
Andrew Tridgell
a55c3709ea make DeterministicIPs the default
(This used to be ctdb commit e7d077e98a40a62dbd6bfd174f29afba7b5529ef)
2007-12-04 15:18:27 +11:00
Ronnie Sahlberg
7cef33b40a rework banning/unbanning nodes
ctdb_recoverd.c
Always handle banning/unbanning locally on the node that is being 
banned/unbanned instead of on the recovery master.
This means that if a ban request comes in to the recovery master for a 
remote node, we pass the request on to the remote node instead of 
setting up the ban and ban timeouts locally.

ctdb.c
send ban/unban requests to the node being banned/unbanned instead of to 
the recmaster

(This used to be ctdb commit 880dd9f5fd0b91e450da93e195cc5c62cb1dcd6e)
2007-12-03 15:45:53 +11:00
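
The routing rule described in the message above can be sketched as a single decision. The names (`sketch_route_ban`, `sketch_ban_action`) are hypothetical and the node numbers are plain integers; the real daemon sends controls between nodes, which this sketch does not model.

```c
#include <assert.h>
#include <stdint.h>

enum sketch_ban_action {
    SKETCH_BAN_LOCALLY,   /* set up the ban and its timeout on this node */
    SKETCH_BAN_FORWARD    /* pass the request on to the node being banned */
};

/* Hypothetical decision: a ban or unban is only ever applied on the node it
 * targets; any other node, including the recovery master, forwards it. */
enum sketch_ban_action sketch_route_ban(uint32_t self_node, uint32_t target_node)
{
    return self_node == target_node ? SKETCH_BAN_LOCALLY : SKETCH_BAN_FORWARD;
}
```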
Ronnie Sahlberg
64008e28bb for the banned status, we should allocate this structure as a child of
the banned_nodes array and not the rec structure, so that ban_state is
destroyed when the banned_nodes array gets destroyed
(and so that when this struct is destroyed, any pending
ctdb_ban_timeout events are also destroyed.)

otherwise we may end up with multiple ban_timeout timed events running in
parallel, since we destroy/recreate the banned_nodes structure during
election but we never destroy/recreate the rec structure.

(This used to be ctdb commit fbd663d56a2a4421a5c0e541962c87e2e9c7cd82)
2007-12-03 11:39:17 +11:00
Ronnie Sahlberg
ad6abacca7 merge from tridge
(This used to be ctdb commit db2f9197ede28cc19c190c38e977bff09f13b729)
2007-12-03 10:21:45 +11:00
Andrew Tridgell
7edb41692e merge from ronnie
(This used to be ctdb commit 6653a0b67381310236e548e5fc0a9e27209b44e0)
2007-12-03 10:19:24 +11:00
Ronnie Sahlberg
2f1baf34d3 up the loglevel for the enable/disable monitoring to level 1
(This used to be ctdb commit 5043a0afeedbd30c7f64c2733c8ae5bf75479a98)
2007-12-01 10:06:42 +11:00
Ronnie Sahlberg
07dd0f6ff0 log that monitoring has been "disabled" not that it has been "stopped"
when monitoring is disabled

(This used to be ctdb commit e7c92f661a523deae9544b679d412ae79cc0ede7)
2007-11-30 10:53:35 +11:00
Ronnie Sahlberg
975fbc8e22 always set up a new monitoring event regardless of whether monitoring is
enabled or not

(This used to be ctdb commit c3035f46d1a65d2d97c8be7e679d59e471c092c2)
2007-11-30 10:14:43 +11:00
Ronnie Sahlberg
50573c5391 add ctdb_disable/enable_monitoring() that only modifies the monitoring
flag.
change calling of the recovered/takeip/releaseip event scripts to use 
these enable/disable functions instead of stopping/starting monitoring.

when we disable monitoring we want all events to still be running,
in particular the events to monitor for dead nodes, and we only want to
suppress running the monitor event scripts

(This used to be ctdb commit a006dcc4f75aba950dd701ad7d1a84e89df285e8)
2007-11-30 10:09:54 +11:00
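
The distinction drawn in the message above, between stopping monitoring and merely disabling the monitor event scripts, can be sketched as follows. The struct and function names are hypothetical, and the counters stand in for the real work of checking nodes and running scripts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_monitor {
    bool scripts_enabled;          /* the only thing enable/disable toggles */
    unsigned dead_node_checks;     /* stand-in for the dead-node check */
    unsigned monitor_script_runs;  /* stand-in for running "monitor" scripts */
};

/* One monitoring tick: the dead-node check always runs, while the monitor
 * event scripts run only when they are enabled. */
void sketch_monitor_tick(struct sketch_monitor *m)
{
    assert(m != NULL);
    m->dead_node_checks++;
    if (m->scripts_enabled) {
        m->monitor_script_runs++;
    }
}
```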
Ronnie Sahlberg
0eb6c04dc1 get rid of the control to set the monitoring mode.
monitoring should always be enabled
(though a node may want to temporarily disable running the "monitor"
event scripts but can do so internally without the need for this 
control)

(This used to be ctdb commit e3a33618026823e6af845fd8513cddb08e6b5584)
2007-11-30 10:00:04 +11:00
Ronnie Sahlberg
192ba82b73 ->monitor_context is NULL when monitoring is disabled.
Check whether monitoring is enabled or not before creating new events
and log why the event is not set up otherwise

(This used to be ctdb commit 2f352b2606c04a65ce461fc2e99e6d6251ac4f20)
2007-11-30 09:02:37 +11:00
Ronnie Sahlberg
8ac8cce487 don't manipulate ctdb->monitoring_mode directly from the SET_MON_MODE
control, instead call ctdb_start/stop_monitoring()

ctdb_stop_monitoring(): don't allocate a new monitoring context, leave it
NULL. Also set the monitoring_mode in this function so that 
ctdb_stop/start_monitoring() and ->monitoring_mode are kept in sync.
Add a debug message to log that we have stopped monitoring.

ctdb_start_monitoring(): check whether monitoring is already active and
make the function idempotent.
Create the monitoring context when monitoring is started.
Update ->monitoring_mode once the monitoring has been started.
Add a debug message to log that we have started monitoring.

When we temporarily stop monitoring while running an event script,
restart monitoring after the event script wrapper returns instead of in 
the event script callback.

Let monitoring_mode start out as DISABLED and let it be enabled once we call ctdb_start_monitoring.

don't check for MONITORING_DISABLED in check_fore_dead_nodes(). If
monitoring is disabled, this event handler will not be called.

(This used to be ctdb commit 3a93ae8bdcffb1adbd6243844f3058fc742f76aa)
2007-11-30 08:44:34 +11:00
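
A minimal sketch of the start/stop behaviour listed in the message above: the mode starts out disabled, the context is NULL whenever monitoring is stopped, the mode and the context change together, and starting twice is harmless. All names are hypothetical and `malloc` stands in for whatever allocator the daemon uses; this is not the ctdb implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

enum sketch_mon_mode { SKETCH_MON_DISABLED, SKETCH_MON_ACTIVE };

struct sketch_ctdb {
    enum sketch_mon_mode monitoring_mode;
    void *monitor_context;   /* NULL whenever monitoring is stopped */
};

/* Monitoring starts out disabled until it is explicitly started. */
void sketch_init(struct sketch_ctdb *ctdb)
{
    assert(ctdb != NULL);
    ctdb->monitoring_mode = SKETCH_MON_DISABLED;
    ctdb->monitor_context = NULL;
}

/* Returns true if monitoring was started by this call, false if it was
 * already active (idempotent) or the context could not be allocated. */
bool sketch_start_monitoring(struct sketch_ctdb *ctdb)
{
    assert(ctdb != NULL);
    if (ctdb->monitoring_mode == SKETCH_MON_ACTIVE) {
        return false;
    }
    ctdb->monitor_context = malloc(1);
    if (ctdb->monitor_context == NULL) {
        return false;
    }
    /* update the mode only once the context exists */
    ctdb->monitoring_mode = SKETCH_MON_ACTIVE;
    return true;
}

/* Returns true if monitoring was stopped by this call. */
bool sketch_stop_monitoring(struct sketch_ctdb *ctdb)
{
    assert(ctdb != NULL);
    if (ctdb->monitoring_mode == SKETCH_MON_DISABLED) {
        return false;
    }
    free(ctdb->monitor_context);
    ctdb->monitor_context = NULL;   /* no new context is allocated on stop */
    ctdb->monitoring_mode = SKETCH_MON_DISABLED;
    return true;
}
```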
Ronnie Sahlberg
5c3a270991 move ctdb_set_culprit higher up in the file
when we are the recmaster and we update the local flags for all the
nodes, if one of the nodes fails to respond with its flags,
set that node as a "culprit"

as one of the first things to do in the monitor_cluster loop, check if
the current culprit has caused too many (20) failures and if so ban that
node.

this is for the situation where a remote node may still be CONNECTED but
it fails to respond to the getnodemap control, causing the recovery
master to loop in monitor_cluster, aborting the monitoring when the
node fails to respond, but before anything will trigger a call to
do_recovery().
If one or more of the databases or nodes are frozen at this stage, this
would lead to smbd being blocked for a potentially long time.

(This used to be ctdb commit 83b0261f2cb453195b86f547d360400103a8b795)
2007-11-28 15:04:20 +11:00
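
The culprit bookkeeping described in the message above can be sketched as a counter that follows the node currently blamed for failures. The names are hypothetical, and two details are assumptions rather than facts from the commit: the counter resets when the blame moves to a different node, and the ban triggers once the count reaches 20.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_CULPRIT_FAILURES 20u   /* the "(20)" quoted in the message */
#define SKETCH_NO_CULPRIT UINT32_MAX      /* assumed marker for "nobody blamed" */

struct sketch_recoverd {
    uint32_t culprit;         /* node currently blamed, or SKETCH_NO_CULPRIT */
    uint32_t culprit_count;   /* failures attributed to that node */
};

/* Blame `node` for one more failure; switching culprit restarts the count. */
void sketch_set_culprit(struct sketch_recoverd *rec, uint32_t node)
{
    assert(rec != NULL);
    if (rec->culprit != node) {
        rec->culprit = node;
        rec->culprit_count = 0;
    }
    rec->culprit_count++;
}

/* Checked early in each monitoring pass: has the culprit failed too often? */
bool sketch_should_ban(const struct sketch_recoverd *rec)
{
    assert(rec != NULL);
    return rec->culprit != SKETCH_NO_CULPRIT &&
           rec->culprit_count >= SKETCH_MAX_CULPRIT_FAILURES;
}
```

Checking the counter at the top of the loop means a node that keeps timing out is eventually banned even if no full recovery is ever triggered.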