mirror of https://github.com/samba-team/samba.git synced 2025-01-14 19:24:43 +03:00

2218 Commits

Author SHA1 Message Date
Ronnie Sahlberg
eb9a77c887 version 1.0.94
(This used to be ctdb commit 5cb4d63bf6887d15aba37fafc3f6b6ba38027f13)
2009-10-08 19:17:57 +11:00
Ronnie Sahlberg
342148628f if a node fails to become frozen during recovery, mark it as a culprit so it will soon get banned
(This used to be ctdb commit f72d33ac73ebb1af802bacdfb30279df3cd8b8f9)
2009-10-08 16:45:25 +11:00
Ronnie Sahlberg
d29c4b5c4d version 1.0.93
(This used to be ctdb commit e77bf5708df6782b4516f698b9981a1d27e2f10b)
2009-10-06 17:05:14 +11:00
Ronnie Sahlberg
42193cbff8 update natgw eventscript to allow you to force it to update and/or to remove the configuration at runtime
(This used to be ctdb commit deed52b7e4aac94b4d11a8d89d08739e1dfd4ed7)
2009-10-06 16:09:24 +11:00
Martin Schwenke
2fa921ba92 Merge commit 'origin/master'
(This used to be ctdb commit 7d91de8a837a12082c343980428153720dcad741)
2009-10-06 13:39:31 +11:00
Martin Schwenke
47f5347963 Document CTDB_NODES_FILE environment variable used by onnode.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 22f0065cd6b66fa0f623f465aaca98883955ac79)
2009-10-06 13:38:00 +11:00
Ronnie Sahlberg
134ed842fa always send the release/take ip controls to make sure all nodes are updated
(This used to be ctdb commit 789703ea684717781c176fd3a2a24d96abde220b)
2009-10-06 12:25:44 +11:00
Ronnie Sahlberg
166b1c97b4 add a new message to ask the recovery daemon to temporarily disable checking ip address consistency.
This is useful when we are moving addresses using moveip in the cluster, since otherwise, if we collide with the recovery daemon's own check, we could cause a recovery.

(This used to be ctdb commit 9c63858c0b22c81eaccb9865a414af0bbb2833d4)
2009-10-06 12:11:32 +11:00
Ronnie Sahlberg
617e393f6b update addip/moveip/delip to make it less likely to trigger an accidental recovery
(This used to be ctdb commit 3befe5526e147d49451fddc930aaafc3dbe2e9c1)
2009-10-06 11:41:18 +11:00
Ronnie Sahlberg
50712d48d3 change some loglevels and also print the pnn of the ip for takeip/releaseip logging
(This used to be ctdb commit 9d95dfbd12898975ba0d8560d95a974210d3de7c)
2009-10-06 11:40:38 +11:00
Ronnie Sahlberg
71e4259150 add a new function to collect a list of all active nodes EXCEPT a certain node
(This used to be ctdb commit be52954d921e7d443304cf49fbd488c619a9c4ec)
2009-10-06 10:52:31 +11:00
Ronnie Sahlberg
3133dadd8f allocate takeoverip state as a child of vnn and also make the takeoverip context a child of vnn
(This used to be ctdb commit 804e5905be51f43c8a338bfbe216fd8d5718850f)
2009-10-06 09:35:15 +11:00
Ronnie Sahlberg
709fc77878 When adding a public ip to a node, make sure to push the assignment of ip addresses out to all nodes so all nodes become aware who currently holds the ip.
(This used to be ctdb commit e8df6fc301fb7faf72c72eb39ea68d44d1526b00)
2009-10-06 08:19:25 +11:00
Ronnie Sahlberg
1d60064139 version 1.0.92
(This used to be ctdb commit 9ffb0d08d34cbafed0e49350a3a72b15d92c8ea7)
2009-10-02 14:38:16 +10:00
Ronnie Sahlberg
f8334e2f68 we should close this file on exec
(This used to be ctdb commit c1c0ebb8da9a6c29ee83868a311f07f30cb4ed16)
2009-10-02 13:41:54 +10:00
Ronnie Sahlberg
2ab8f6a368 Merge commit 'martins/master'
(This used to be ctdb commit 9b206d96da3341836cc25aee5693f551f6f3a80e)
2009-10-01 15:46:01 +10:00
Martin Schwenke
3edf5532d5 Test suite: The ctdb ping test should allow time to go backwards.
Time can actually go backwards during this test if ntpd happens to
adjust it a little bit.  So we should cope...

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 23ae9e9863ea90c6fb3f105403fd098041fa73f4)
2009-10-01 15:39:09 +10:00
Ronnie Sahlberg
dfc2500a1f don't exit on a commit failure
(This used to be ctdb commit 4e9a3a5dc232bac12ab387ea0cf4f1b279bed5c1)
2009-10-01 14:53:35 +10:00
Ronnie Sahlberg
63278ad040 Revert "Revert "allow the transaction commit to fail""
This reverts commit 74e416108df6934f45ca646d709785dd76ab3c35.

(This used to be ctdb commit d1d370033d5007ad1c2c34cd9eeac53001f4b13e)
2009-10-01 14:51:32 +10:00
Ronnie Sahlberg
32286b08ac document how to use the notification script
(This used to be ctdb commit b77e4698e7f83443243965f93b84237f2903cd46)
2009-10-01 14:31:55 +10:00
Ronnie Sahlberg
e90dd8015f add a new notification to trigger on when ctdb has started
(This used to be ctdb commit b1fe04f2e9447f762a0b805763deb29296585ff8)
2009-10-01 14:05:30 +10:00
Martin Schwenke
b27600253d Minor fixes to 01.reclock eventscript.
test -z really needs its argument to be quoted.  Simplified a status
test.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit fe26da7780545b1ecc0a7da5bc1cf8beaeea94cc)
2009-09-30 21:21:56 +10:00
Martin Schwenke
78b7043411 40.vsftpd monitor event only fails after 2 failures to connect to port 21.
Change the monitor event in 40.vsftpd so it only fails if there are 2
successive failures connecting to port 21.  This reduces the
likelihood of unhealthy nodes due to vsftpd being restarted for
reconfiguration due to node failover or system reconfiguration.

New eventscript functions ctdb_counter_init, ctdb_counter_incr,
ctdb_counter_limit.  These are used to count arbitrary things in
eventscripts, depending on the eventscript name and a tag that is
passed, and determine if a specified limit has been hit.  They're good
for counting failures!

These functions are used in 40.vsftpd and also in 01.reclock - the
latter used to do the counting without these functions.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit cfe63636a163730ae9ad3554b78519b3c07d8896)
2009-09-30 21:05:16 +10:00
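
The counting scheme described in the commit above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names (`sketch_counter_init`, `sketch_counter_incr`, `sketch_counter_limit_hit`), their argument order, and the `COUNTER_DIR` state directory are hypothetical and are not taken from the CTDB sources. The commit message only says that counters are keyed by eventscript name and a tag and compared against a limit.

```shell
#!/bin/sh
# Hypothetical sketch of per-script, per-tag counters for eventscripts.
# Names, arguments and the state directory are assumptions, not CTDB code.

COUNTER_DIR="${COUNTER_DIR:-${TMPDIR:-/tmp}/counter-sketch.$$}"

# Build the counter file path from a script name ($1) and a tag ($2).
_sketch_counter_file () {
    echo "$COUNTER_DIR/$1.$2"
}

# Reset the counter for this script/tag pair to zero.
sketch_counter_init () {
    mkdir -p "$COUNTER_DIR"
    echo 0 > "$(_sketch_counter_file "$1" "$2")"
}

# Add one to the counter, creating it first if it does not exist yet.
sketch_counter_incr () {
    _f=$(_sketch_counter_file "$1" "$2")
    [ -f "$_f" ] || sketch_counter_init "$1" "$2"
    _n=$(cat "$_f")
    echo $((_n + 1)) > "$_f"
}

# Return 0 once the counter has reached the limit given in $3.
sketch_counter_limit_hit () {
    _f=$(_sketch_counter_file "$1" "$2")
    [ -f "$_f" ] || return 1
    [ "$(cat "$_f")" -ge "$3" ]
}
```

A monitor event built on this sketch would reset the counter after a successful connection, increment it after a failed one, and report failure only when `sketch_counter_limit_hit` succeeds with a limit of 2. A single failed probe during a vsftpd restart then would not mark the node unhealthy.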
Martin Schwenke
e169ba85f3 Merge commit 'origin/master'
(This used to be ctdb commit 803cfb4cd2f6d139f466053a6d7e104fcb772ef5)
2009-09-30 19:22:59 +10:00
Ronnie Sahlberg
11c56dfd56 New version 1.0.91
(This used to be ctdb commit d1332f4d5d3d3e4b4e0cd362a6903d09e0d5fcbb)
2009-09-29 13:31:41 +10:00
Ronnie Sahlberg
c971d934a9 From Wolfgang Mueller-Friedt
Remove the explicit vacuum/repack commands from the 00.ctdb eventscript
and implement this in the ctdb daemon.

Combine vacuuming and repacking into one
cheap read traverse to enumerate all candidate records
and one write traverse that both repacks the database and also deletes the record locally where we are lmaster and where the records have already been deleted remotely.

this code also adds initial autotuning heuristics for the vacuum intervals and how many records to delete in each iteration.

minor stylistic changes made by ronnie s

(This used to be ctdb commit 95a3ee551241aa164967991fe5efe078e1714bde)
2009-09-29 13:27:19 +10:00
Martin Schwenke
e976209996 Merge commit 'origin/master'
(This used to be ctdb commit 096cdc0c12d22d99f8405bee5cb9f05c616c8492)
2009-09-29 12:59:10 +10:00
Ronnie Sahlberg
9bac6f2e2c change the reclock fail count to 19 monitor intervals before we shut down ctdbd
(This used to be ctdb commit 6e35feb06ec036b9036c5d1cdd94f7cef140d8a6)
2009-09-28 14:12:59 +10:00
Ronnie Sahlberg
4f0f2cc196 add a new eventscript 01.reclock
if the reclock file has been set, then this script will test that the
    reclock file can actually be accessed.
    if the file does not exist, or if attempts to stat the file hang,
    the node will be marked unhealthy after the third failed monitoring event
    and after the tenth failure, ctdb itself will shut down.

(This used to be ctdb commit 2cb04747887674def299e574fccb827c1c3194e7)
2009-09-28 14:06:40 +10:00
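
The escalation described in the commit above (unhealthy after the third failure, shutdown after the tenth) could be expressed as a small decision function. This is a sketch only: the function name `reclock_action_sketch` and the two threshold variables are hypothetical, and it reads "after the third failure" as "once the count reaches 3", which is an assumption about the intended boundary.

```shell
#!/bin/sh
# Hypothetical sketch: map a count of consecutive reclock failures to an action.
# The variable and function names are assumptions, not taken from 01.reclock.

RECLOCK_UNHEALTHY_AFTER="${RECLOCK_UNHEALTHY_AFTER:-3}"
RECLOCK_SHUTDOWN_AFTER="${RECLOCK_SHUTDOWN_AFTER:-10}"

# $1: number of consecutive failed monitor events.
# Prints "ok", "unhealthy" or "shutdown".
reclock_action_sketch () {
    if [ "$1" -ge "$RECLOCK_SHUTDOWN_AFTER" ]; then
        echo shutdown
    elif [ "$1" -ge "$RECLOCK_UNHEALTHY_AFTER" ]; then
        echo unhealthy
    else
        echo ok
    fi
}
```

Keeping the thresholds in variables means a change to the shutdown count only touches one value.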
Ronnie Sahlberg
22dde50be3 add machinereadable output for the ctdb getreclock command
(This used to be ctdb commit 5e7dc36f1649824db2f9dab34bede8b388502a57)
2009-09-28 13:39:54 +10:00
Martin Schwenke
4948051bf4 Test suite: Print debug info on node status timeouts.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a083a1976d621c76121f1fa2c2f484cfa47267bd)
2009-09-25 18:00:17 +10:00
Ronnie Sahlberg
9add8cdc5a Merge commit 'obnox/master-rebase'
(This used to be ctdb commit edb58a417bfeb094cbbbf96caec8e2918256dad9)
2009-09-25 17:34:59 +10:00
Ronnie Sahlberg
a74ca1a1bb Merge root@10.1.1.27:/shared/ctdb/ctdb-git
(This used to be ctdb commit db7195d762f69577c4e28f0b0e0ded0ac7f91f0b)
2009-09-25 13:18:18 +10:00
Ronnie Sahlberg
a82b9cfbfd with the new banning logic with one struct for each node we no longer "forget" the other culprits as often as we used to do, which means that things like "ctdb recover" can now actually lead to a node becoming banned if we perform too many recoveries too frequently.
change this to provide absolution to all nodes once they have participated in a recovery session.

(This used to be ctdb commit f66d17fb2e81a35d5adb3754e1cc902f76b4590a)
2009-09-25 13:14:53 +10:00
Michael Adam
d0289c650e Revert "dont check if commit failed, we do allow the commit to fail sometimes"
This reverts commit affa6f47432507e84b7e76b88a2c27fff8e6e2e4.

Transaction commit should not be allowed to fail.
This is a fatal error.

Michael

(This used to be ctdb commit 4364419a486c1995bea56dab603cc4960e7c8e7a)
2009-09-21 11:16:18 +02:00
Michael Adam
fcaca26ec4 Revert "allow the transaction commit to fail"
This reverts commit 7a6134e684c9ac4763bf198ef1410867b6082c94.

Transaction commit should not be allowed to fail.
This is a fatal error.

Michael

(This used to be ctdb commit 74e416108df6934f45ca646d709785dd76ab3c35)
2009-09-21 11:16:18 +02:00
Michael Adam
3cb4bcd211 ctdb_client: fix race in starting concurrent transactions on a single node
There are two races in concurrent transactions on a single node.
One in starting a transaction, and one with committing (replaying).

This commit closes the first race by storing the pid in the
transaction-lock record and comparing the own pid against it
as a measure to prevent starting a second transaction when
a second node has come in between and changed the pid in the lock
record.

Michael

(This used to be ctdb commit 84e5a55a900b01903b80e23045edfc726d8d77a1)
2009-09-21 11:16:18 +02:00
Ronnie Sahlberg
eb305efdb0 Merge commit 'martins/master'
(This used to be ctdb commit 0e6a52ee66830e7742eaa392cd3dd9caeb808fb3)
2009-09-18 14:23:37 +10:00
Ronnie Sahlberg
4b7f6c8a29 don't mark the recovery daemon as a ban culprit just because a node in the cluster was set to recovery mode == ACTIVE.
This happens normally when someone explicitly triggers a recovery using "ctdb recover"

(This used to be ctdb commit 3085170be8460e59996a3eee4e29fec9ddbcf0f8)
2009-09-18 12:58:30 +10:00
Ronnie Sahlberg
4a05b2dfd8 try restarting statd indefinitely, not just once
(This used to be ctdb commit 03b0d913ae009284e2fadda1b9246ec77d19db29)
2009-09-15 19:33:53 +10:00
Ronnie Sahlberg
029fd6b00f Revert "try to restart statd everytime it fails, not just the first time"
This reverts commit 4f7b39a4871af28df1c4545ec37db179fa47a7da.

(This used to be ctdb commit db7b96304e4725f29b12398b7582e385daed63ed)
2009-09-15 19:33:35 +10:00
Ronnie Sahlberg
59cacded72 try to restart statd everytime it fails, not just the first time
(This used to be ctdb commit 4f7b39a4871af28df1c4545ec37db179fa47a7da)
2009-09-15 13:35:58 +10:00
Ronnie Sahlberg
c3556c3d88 Merge commit 'obnox/master-rebase'
(This used to be ctdb commit 1ae3a40705e14efcc24f558cd4d677932765c4fd)
2009-09-15 08:05:33 +10:00
Ronnie Sahlberg
ee9fe64029 Merge root@10.1.1.27:/shared/ctdb/ctdb-git
(This used to be ctdb commit b5410e7be0525e6e5cd49ccebc7bbc57086f3cb2)
2009-09-12 07:05:21 +10:00
Ronnie Sahlberg
6e793bec7c new version 1.0.90
(This used to be ctdb commit 5624da65d3fad1905c9f93a9e41a90b98ad692d2)
2009-09-12 07:30:18 +10:00
Martin Schwenke
3d8fa9e9e3 Test suite: Update "complex" tests for wait_until_node_has_status() change.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 31216fd48117526c943e42d137ce24ef89fa0009)
2009-09-11 16:15:31 +10:00
Martin Schwenke
b8b28cb567 Test suite: wait_until_node_has_status() now uses "onnode any".
Many tests currently do this sort of thing:

  onnode 0 $CTDB_TEST_WRAPPER wait_until_node_has_status 1 disconnected

In fact, they all use exactly the same "onnode 0 $CTDB_TEST_WRAPPER"
idiom.  This is both repetitious and dangerous, since node 0 might be
shut down during a test.  Instead, we push "onnode any
$CTDB_TEST_WRAPPER" (which selects a connected node) into
wait_until_node_has_status() and just call that function directly in
tests, like this:

  wait_until_node_has_status 1 disconnected

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a2aaef03d4d6bbd4b42f50f732254935d4d3469c)
2009-09-11 15:55:53 +10:00
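
The refactoring described in the commit above, where the node-selection prefix moves into the helper, might look like the following. Everything here is a stand-in so the snippet can run on its own: `onnode` is replaced by a local function that only records the requested node spec, `_node_has_status_sketch` is a hypothetical placeholder for the real remote check, and the `$CTDB_TEST_WRAPPER` indirection and the polling loop are left out.

```shell
#!/bin/sh
# Stand-in for the real "onnode" tool (assumption): remember which node
# spec was requested, then run the remaining arguments locally.
onnode () {
    LAST_NODE_SPEC=$1
    shift
    "$@"
}

# Hypothetical placeholder for the status check that would run on a node.
_node_has_status_sketch () {
    echo "node $1 is $2"
}

# The helper supplies "onnode any" itself, so callers no longer need to
# hard-code "onnode 0" in every test.
wait_until_node_has_status_sketch () {
    onnode any _node_has_status_sketch "$@"
}
```

With the prefix inside the helper, a test only writes `wait_until_node_has_status_sketch 1 disconnected`, and choosing a connected node is handled in one place.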
Martin Schwenke
7e09c07a34 Test suite: Rework the cluster (re)start code.
Make it possible to start on only 1 node - for tests that need to
restart a particular node.

_ctdb_hack_options() attempts to see what options are being passed to
a daemon that is being run via the initscript.  It then sets a
corresponding environment variable that the initscript knows about.
Currently only the --start-as-stopped option is supported.  This is
extremely ugly but it seems like the only way...  :-(

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 407b3117dfc1072117abf681ec98b9e252d8744c)
2009-09-11 14:06:12 +10:00
Michael Adam
e80a7001ff Introduce sysconfig variable CTDB_SYSLOG=yes/no (default "no").
This allows for controlling start of ctdbd with or without the option "--syslog"
from the sysconfig/ctdb file.

Michael

(This used to be ctdb commit 7bf9fff9139a4270496bddb97f9433bab87824bf)
2009-09-09 09:52:14 +02:00
Michael Adam
4c78f88dff ctdb_logging: fix a comment typo.
Michael

(This used to be ctdb commit e5ba8e1a832c223496ad72209ce1d3203cdaa2d7)
2009-09-09 09:52:13 +02:00