mirror of https://github.com/samba-team/samba.git synced 2025-01-12 09:18:10 +03:00
Commit Graph

2133 Commits

Author SHA1 Message Date
Ronnie Sahlberg
6450ae533a Don't even try allocating and sending a CALL packet if the transport is down
(This used to be ctdb commit cb8dd896914d4e44ad7b8bb000176a7c78f394ae)
2009-06-30 12:16:13 +10:00
Ronnie Sahlberg
127754e192 failing a dmaster send due to the transport being down is fatal
(This used to be ctdb commit c17dafc79bec25bbb796478c33f503503d382a20)
2009-06-30 12:14:58 +10:00
Ronnie Sahlberg
757ba01ddc if we fail a dmaster migration due to the transport being down, then that is a fatal condition.
(This used to be ctdb commit 75dea671f68ac6649095357c36b3697a927721e9)
2009-06-30 12:13:15 +10:00
Ronnie Sahlberg
dd1774cd85 don't try to send error packets if the transport is down
(This used to be ctdb commit 65b94d280731df3245b26d69f39acfaf5bccf0d8)
2009-06-30 12:10:27 +10:00
Ronnie Sahlberg
d4b30b34aa don't even try to send a message from the main daemon if the transport is down
(This used to be ctdb commit 9a2c4c3ed09ac9ea781d06999d11e5c3b5b4a97a)
2009-06-30 12:09:28 +10:00
Ronnie Sahlberg
9e5064dcea Don't try to allocate and send packets if the transport is down
(This used to be ctdb commit 945f04f06a425fd3940a2e4b832c63223a3f26b3)
2009-06-30 12:03:12 +10:00
Ronnie Sahlberg
22fb69d337 don't even try to allocate a packet if the transport is down since it will fail
(This used to be ctdb commit a73f316cb9cec877dc0bc3f7baa21be1b1454273)
2009-06-30 11:55:42 +10:00
Ronnie Sahlberg
243bb51f02 New version 1.0.86
(This used to be ctdb commit 841a2d9635341baa1a6dd9ec558fc7cadb4e3af4)
2009-06-30 09:09:06 +10:00
Ronnie Sahlberg
ce54b6dc8b update the man pages with the "getreclock" and "setreclock" commands.
(This used to be ctdb commit 3db8b1d7425ed5bd41e58b43c55fdac517d71baf)
2009-06-25 14:45:57 +10:00
Ronnie Sahlberg
816db4be38 Do not allow the "VerifyRecoveryLock" tunable to be changed if there is no reclock file
(This used to be ctdb commit 5334e40978350b6b597ee020bac52e37c8f9a8ba)
2009-06-25 14:45:17 +10:00
Ronnie Sahlberg
969cb64056 disable VerifyRecoveryLock when the user modifies the filename
(This used to be ctdb commit d973cb6e83b2f7cc37bd39c1219dcfbd4911a8ee)
2009-06-25 14:34:21 +10:00
Ronnie Sahlberg
5b235c3999 add a control to set the reclock file
(This used to be ctdb commit 36cc2e586f03fa497ee9b06f3e6afc80219c4aaa)
2009-06-25 14:25:18 +10:00
Ronnie Sahlberg
7f8d98ebb0 update the recovery daemon to read the recovery lock file off the main daemon and handle when the file is changed/enabled/disabled
(This used to be ctdb commit 31acc11a6389d4dd9f7b71b7cfa2f2450076f1f7)
2009-06-25 12:55:43 +10:00
Ronnie Sahlberg
10db6a41df return NULL and not a "" when there is no reclock file returned from the server
(This used to be ctdb commit 6755f89f81aba63bfe00ee16d44a0201cbfa90ca)
2009-06-25 12:26:14 +10:00
Ronnie Sahlberg
2b253c094c add a control to read the current reclock file from a node
(This used to be ctdb commit ed6a4cbcdcbb4e0df83bec8be67c30288bf9bd41)
2009-06-25 12:17:19 +10:00
Ronnie Sahlberg
4a1a3652fe Document that you can run ctdb without a reclock file in the sysconfig file
(This used to be ctdb commit 33895d217ee096b356f02b5292ba27a840c4f559)
2009-06-25 11:59:21 +10:00
Ronnie Sahlberg
77ef745394 Allow setting the recovery lock file as "", which means that we do not use a file and that we implicitly also disable the recovery lock checking.
Update the init script to allow starting without a reclock file.

(This used to be ctdb commit 07855ff5eba71e7d607d52e234a42553d9b93605)
2009-06-25 11:50:45 +10:00
Ronnie Sahlberg
180a576f7b Don't access the reclock file at all if VerifyRecoveryLock is zero and also
make sure the reclock file is closed if the variable is cleared at runtime

(This used to be ctdb commit a25f4888689a0725971606163d87c39a41669292)
2009-06-25 11:41:18 +10:00
Ronnie Sahlberg
52861523f6 new version 1.0.85
(This used to be ctdb commit a4b682e3b2657abeca3e387d96949f83bdbd7b2f)
2009-06-23 11:30:25 +10:00
Ronnie Sahlberg
5f680fa2b4 rename 99.routing to 11.routing so that it is executed before the service scripts
(This used to be ctdb commit 9bc8e7eec7ffa8969f0f170a77b13cd0033790f1)
2009-06-23 11:29:26 +10:00
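The rename above only has an effect if event scripts run in order of their numeric prefix. A minimal sketch of that ordering, assuming the script names are simply sorted lexically; the function name `event_script_order_sketch` is hypothetical, and `31.clamd` and `41.httpd` are the service scripts named elsewhere in this log:

```shell
# Hypothetical helper: print script names in the order a lexically
# sorted event-script directory would run them.
event_script_order_sketch() {
    printf '%s\n' "$@" | LC_ALL=C sort
}

# With the old name, routing sorts after the service scripts.
event_script_order_sketch 99.routing 31.clamd 41.httpd
# With the new name, routing sorts before them.
event_script_order_sketch 11.routing 31.clamd 41.httpd
```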
Martin Schwenke
566314ca97 Fix minor problem in previous initscript commit.
The valgrind start case should not use daemon, since this is specific
to Red Hat.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 867f57d166395c92949e480ca725249b0ca8950b)
2009-06-19 18:08:54 +10:00
Martin Schwenke
3dad79b88e Initscript fixes, mostly for "stop" action.
Use a local variable $ctdbd so that we always run ctdbd from the
same place and so that we know what to kill.  This variable respects
the $CTDBD environment variable, which may be used to specify an
alternative location for the daemon.

In the important cases use "pkill -0 -f" to check if ctdbd is
running.  Also, remove the special case for killing ctdbd when running
under valgrind.  The regular case will handle this just fine.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 070305adfe636c2580776e6bf24bb8be06622b86)
2009-06-19 18:08:31 +10:00
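The approach described in this commit can be sketched as follows. This is not the initscript itself: the fallback path `/usr/sbin/ctdbd` and both function names are assumptions made for illustration; only `$CTDBD` and `pkill -0 -f` come from the commit message.

```shell
# Resolve the daemon path once, honouring $CTDBD if it is set.
# The fallback path is an assumption, not taken from the initscript.
ctdbd_path_sketch() {
    echo "${CTDBD:-/usr/sbin/ctdbd}"
}

# Signal 0 delivers nothing; pkill's exit status only reports whether
# a process whose full command line matches the path exists.
ctdbd_is_running_sketch() {
    pkill -0 -f "$(ctdbd_path_sketch)"
}
```

Because the same resolved path is used both to start the daemon and to look for it, the "stop" action knows which process to signal even when an alternative binary was selected through `$CTDBD`.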
Martin Schwenke
7bfc19d635 Clean up the handling of CTDB restarts in testcases.
Glitches during restarts of the CTDB cluster have been causing some
tests to fail.  This is because restarts are initiated in the body of
many tests.  This adds a simple function ctdb_restart_when_done, which
schedules a restart using an existing hook in the test exit code.
This function is now used in tests that need to restart CTDB.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit fc69b6a66282d5be6edeb286bf72aeafb252e6dd)
2009-06-19 18:03:14 +10:00
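The deferred-restart idea in this commit can be illustrated with a small sketch. Every name here (`restart_when_done_sketch`, `exit_hook_sketch`, `run_test_sketch`, `example_test_body`) is hypothetical; the real `ctdb_restart_when_done` and the exit hook it uses live in the test suite and are not reproduced in this log.

```shell
restart_needed=false

# Called from a test body: only records that a restart is wanted.
restart_when_done_sketch() {
    restart_needed=true
}

# Stand-in for the existing exit hook: performs the deferred restart.
exit_hook_sketch() {
    if [ "$restart_needed" = true ]; then
        echo "restarting cluster"   # placeholder for the real restart
    fi
}

# Run a test body, then the exit hook, preserving the body's status.
run_test_sketch() {
    restart_needed=false
    "$@"
    status=$?
    exit_hook_sketch
    return "$status"
}

example_test_body() {
    restart_when_done_sketch
    echo "test body still sees a stable cluster"
}

run_test_sketch example_test_body
```

The point of the design is that the restart happens after the test body has finished, so restart glitches can no longer disturb the checks in the body.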
Martin Schwenke
635da189dc Fix minor onnode bugs relating to local daemons.
Commit a0f5148ac749758e2dfbd6099e829c5bf1d900e6 caused a subtle
regression.  Due to the subtlety, this description is much longer than
the 1 line patch that fixes it!  The regression, where a process that
invokes onnode is unexpectedly blocked, is only apparent if the
following conditions are met:

1. $CTDB_NODES_SOCKETS is set;
2. The command passed to onnode attempts to background a process; and
3. onnode is run in certain types of subshell (e.g. foo=$(onnode ...)).

In particular, when testing against local daemons (i.e. condition (1)
is met), tests/simple/07_ctdb_process_exists.sh would fail (because it
does both (2), (3)).

The problem is caused by the use of file descriptor 3 in the code that
allows separate filtering of stdout and stderr.  A backgrounded
process will have this descriptor open and the $(...) construct
appears to wait for all file descriptors to be closed.  This only
happens with local daemons because SSH is replaced by a shell and file
descriptor 3 leaks into that shell.  It does not occur when SSH is
used because the file descriptor does not leak into the remote shell
where the process is backgrounded.

The fix is simply to redirect file descriptor 3 to /dev/null in the
fakessh function, which is used when $CTDB_NODES_SOCKETS is set.

Also fixed is another minor bug when the -o option and
$CTDB_NODES_SOCKETS are used in combination.  The code uses the node
name as a suffix for the output filename(s).  Usually this is an IP
address.  However, when $CTDB_NODES_SOCKETS is in use the node name is
the socket name, which might be a path several directories deep.
Each output file is created via a simple redirection and this would
fail if unexpected directories appear in the filename.  3 possible
fixes were considered:

1. Replace all '/'s in the node name by '_'s.  Nice and simple.
2. Use the basename of the node name.  However, sockets may be in
   different directories but have the same basename.
3. Create all required directories before redirecting.  This is a
   little more complex and probably doesn't meet the user's
   expectations.

Option (1) is implemented here.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 5d320099025b6835eda3a1e431708f7e0a6b0ba6)
2009-06-19 18:02:17 +10:00
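Both fixes in this commit can be sketched in a few lines. This is an illustration under assumptions, not the onnode code: the function names and the example socket path are hypothetical, and the real fakessh function does more than shown here.

```shell
# Option (1): flatten a node name into a safe filename suffix by
# replacing every '/' with '_'.
node_suffix_sketch() {
    printf '%s\n' "$1" | tr '/' '_'
}

# Stand-in for fakessh: run the command in a local shell with fd 3
# pointed at /dev/null, so a backgrounded process cannot inherit the
# caller's descriptor 3 and keep a $(...) substitution waiting.
fakessh_sketch() {
    sh -c "$*" 3>/dev/null
}

# An IP address is unchanged; a socket path loses its directories.
node_suffix_sketch 192.168.1.1
node_suffix_sketch /tmp/ctdb/node.1/ctdb.socket
```

With the suffix flattened, a plain redirection such as `> "$prefix.$suffix"` no longer depends on intermediate directories existing.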
Ronnie Sahlberg
de1402d471 don't log an error if waitpid returns -1 and errno is ECHILD
(This used to be ctdb commit fdf50f3e774e3980af81c0b6f4ff81d085f4f697)
2009-06-19 15:55:13 +10:00
Ronnie Sahlberg
baead0fdcc don't leak file descriptors when set recmode times out
(This used to be ctdb commit fc8a364eb095ec11ca01246a583bf1dc53510141)
2009-06-19 14:58:06 +10:00
Ronnie Sahlberg
d3c5fb4bd1 don't leak file descriptors
(This used to be ctdb commit 268c3e4b269a92741a02280c84384178e73de10e)
2009-06-19 14:54:22 +10:00
Ronnie Sahlberg
d72b14e86c in the recovery daemon, check that the recovery master can access the recovery lock file and verify it is not stale from a child process.
This allows us to timeout the operation if the underlying filesystem has become temporarily unresponsive without causing a new recovery.

(This used to be ctdb commit d177b08f1dc79534491f27726b05405d47e12e20)
2009-06-19 14:44:26 +10:00
Ronnie Sahlberg
1183b364f1 reduce the timeout we wait for the reclock child process to finish to 5 seconds
before we log an error and abort

(This used to be ctdb commit 6d1e4321b63973c2e53c63d386e8cc0bd9605cae)
2009-06-19 13:09:11 +10:00
Martin Schwenke
4697829e7c Fix minor onnode bugs relating to local daemons.
Commit a0f5148ac749758e2dfbd6099e829c5bf1d900e6 caused a subtle
regression.  Due to the subtlety, this description is much longer than
the 1 line patch that fixes it!  The regression, where a process that
invokes onnode is unexpectedly blocked, is only apparent if the
following conditions are met:

1. $CTDB_NODES_SOCKETS is set;
2. The command passed to onnode attempts to background a process; and
3. onnode is run in certain types of subshell (e.g. foo=$(onnode ...)).

In particular, when testing against local daemons (i.e. condition (1)
is met), tests/simple/07_ctdb_process_exists.sh would fail (because it
does both (2), (3)).

The problem is caused by the use of file descriptor 3 in the code that
allows separate filtering of stdout and stderr.  A backgrounded
process will have this descriptor open and the $(...) construct
appears to wait for all file descriptors to be closed.  This only
happens with local daemons because SSH is replaced by a shell and file
descriptor 3 leaks into that shell.  It does not occur when SSH is
used because the file descriptor does not leak into the remote shell
where the process is backgrounded.

The fix is simply to redirect file descriptor 3 to /dev/null in the
fakessh function, which is used when $CTDB_NODES_SOCKETS is set.

Also fixed is another minor bug when the -o option and
$CTDB_NODES_SOCKETS are used in combination.  The code uses the node
name as a suffix for the output filename(s).  Usually this is an IP
address.  However, when $CTDB_NODES_SOCKETS is in use the node name is
the socket name, which might be a path several directories deep.
Each output file is created via a simple redirection and this would
fail if unexpected directories appear in the filename.  3 possible
fixes were considered:

1. Replace all '/'s in the node name by '_'s.  Nice and simple.
2. Use the basename of the node name.  However, sockets may be in
   different directories but have the same basename.
3. Create all required directories before redirecting.  This is a
   little more complex and probably doesn't meet the user's
   expectations.

Option (1) is implemented here.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c97d56d93d9c1007a4e85affb19ed0c2d0e11b6d)
2009-06-19 12:12:39 +10:00
Martin Schwenke
62871fbcd5 Clean up the handling of CTDB restarts in testcases.
Glitches during restarts of the CTDB cluster have been causing some
tests to fail.  This is because restarts are initiated in the body of
many tests.  This adds a simple function ctdb_restart_when_done, which
schedules a restart using an existing hook in the test exit code.
This function is now used in tests that need to restart CTDB.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d440e83bb4f0c19c085915d0f0e87cc0dabbc569)
2009-06-19 11:40:09 +10:00
Ronnie Sahlberg
0ddf79a3bc increase the timeout before we shutdown when the recovery daemon is hung
(This used to be ctdb commit facddcacb4a961cddb117818fa38a3e97770b2fa)
2009-06-18 09:20:18 +10:00
Ronnie Sahlberg
34fbfb8b89 rename 99.routing to 11.routing
so it is executed before any of the service scripts

(This used to be ctdb commit 1205673499618f90f413fad9e96a88733b5ce359)
2009-06-18 09:11:46 +10:00
Martin Schwenke
b0fd8fffcf New tests for NFS and CIFS tickles.
New tests/complex/ subdirectory contains 2 new tests to ensure that
NFS and CIFS connections are tracked by CTDB and that tickle resets
are sent when a node is disabled.

Changes to ctdb_test_functions.bash to support these tests.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 5d188af387a2a1d68d66f47edb7a9ca546ed357c)
2009-06-18 09:04:43 +10:00
Martin Schwenke
133826f5da Increase threshold in 51_ctdb_bench from 2% to 5%.
The threshold for the difference in the number of messages sent in either
direction around the ring of nodes was set to 2%.  Something
environmental is causing this difference to sometimes be as high as 3%.
We're confident it isn't a CTDB issue so we're increasing the
threshold to 5%.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit be3e23c9fcb9c716e492af102830a4f6ad8bda7b)
2009-06-18 09:02:21 +10:00
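The check being relaxed here amounts to a relative-difference comparison. A minimal sketch, assuming the difference is measured against the larger of the two counts (the log does not say how 51_ctdb_bench computes it, and `within_threshold_sketch` is a hypothetical name):

```shell
# Succeed when the two counts differ by at most max_pct percent of the
# larger one. Cross-multiplying avoids integer-division rounding.
within_threshold_sketch() {
    a=$1 b=$2 max_pct=$3
    if [ "$a" -ge "$b" ]; then
        larger=$a diff=$((a - b))
    else
        larger=$b diff=$((b - a))
    fi
    [ $((diff * 100)) -le $((max_pct * larger)) ]
}

# A 3% imbalance fails the old 2% limit but passes the new 5% one.
within_threshold_sketch 1000 970 2 || echo "3% difference exceeds 2%"
within_threshold_sketch 1000 970 5 && echo "3% difference is within 5%"
```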
Martin Schwenke
1f3a602b88 Merge commit 'origin/master'
(This used to be ctdb commit 8ddd5165f573fc6beaae589b86a6afa4bc17f32a)
2009-06-16 12:56:55 +10:00
Martin Schwenke
ffff61c13b New tests for NFS and CIFS tickles.
New tests/complex/ subdirectory contains 2 new tests to ensure that
NFS and CIFS connections are tracked by CTDB and that tickle resets
are sent when a node is disabled.

Changes to ctdb_test_functions.bash to support these tests.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 31cc46eb157ca1301312f14879e4fb4da7d81088)
2009-06-16 12:47:59 +10:00
Martin Schwenke
ad3c89095e Make 51_ctdb_bench.sh more tolerant.
Limit the allowable difference in message counts in either direction
around the ring to 5% (up from 2%).  There is something environmental
making this blow out to 3% very occasionally when there's no obvious
problem with ctdb.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d6e6909ac629212b3028e13b958e1a17c64bee8c)
2009-06-10 16:15:09 +10:00
Ronnie Sahlberg
d1c40424f6 When we ban a node, only drop the IPs on the node being banned, not on every node
(This used to be ctdb commit 46e8c3737e6ff54fc80de8e962e922924c27bc35)
2009-06-10 10:35:20 +10:00
Ronnie Sahlberg
2bb687c4cd remove unused variable
(This used to be ctdb commit 2a52336ec021dfe8d56ba72726feb7b2dbd41f68)
2009-06-09 10:58:46 +10:00
Ronnie Sahlberg
ac931b1371 don't require particular values for NoIPFailback and DeterministicIPs when
using ctdb moveip

(This used to be ctdb commit d350c631850377c09968d2978ef57d2bd0d50116)
2009-06-09 10:57:46 +10:00
Ronnie Sahlberg
f135684766 improve ctdb moveip so that it does not always trigger a recovery.
(This used to be ctdb commit 0ca28d7336463ecd2ff65620d8dbcbb496991531)
2009-06-09 10:56:50 +10:00
Ronnie Sahlberg
f6ccf96898 try to avoid causing a recovery when deleting a public ip from a node
(This used to be ctdb commit 6318ea13464e2fe630084c40802d8e697c2cb999)
2009-06-05 17:57:14 +10:00
Ronnie Sahlberg
b046f5e3aa when adding an ip, try manually adding and taking over the ip instead of triggering a full recovery to do the same thing
(This used to be ctdb commit 4d5d22e64270cfb31be6acd71f4f97ec43df5b2c)
2009-06-05 17:00:47 +10:00
Ronnie Sahlberg
79eef7f2b5 don't list DELETED nodes in the ctdb listnodes output
(This used to be ctdb commit 7eb137aa4c24c69bd93b98fb3c7108e5f3288ebd)
2009-06-04 13:25:58 +10:00
Ronnie Sahlberg
f691b96d84 make it possible to run 'ctdb listnodes' even if the daemon is not running.
in this case, read the nodes file directly instead of asking the local daemon for the list.

add an option -Y to provide machine-readable output to listnodes

(This used to be ctdb commit 4a55cacc4f5526abd2124460b669e633deeda408)
2009-06-04 13:21:25 +10:00
Ronnie Sahlberg
85d67197fe From William Jojo <w.jojo[AT]hvcc.edu>
AIX doesn't have getopt.h by default.
Don't try including this file when building on AIX

(This used to be ctdb commit 06b33a826e71e1dd2f9e02ad614be55535d42045)
2009-06-04 09:41:05 +10:00
Martin Schwenke
0219d12fd4 Merge branch 'init_rewrite'
Conflicts:
	config/ctdb.init

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 92be87b5bfed7882b48f4034c82dfdb031f3afdc)
2009-06-02 16:40:01 +10:00
Martin Schwenke
1c2e7871eb Merge commit 'origin/master'
(This used to be ctdb commit 135b72828fc76856fa8f6d7f9c820120de05596b)
2009-06-02 16:29:25 +10:00
Martin Schwenke
b1b1cbb274 Initscript cleanups.
* Move building of CTDB_OPTIONS to new function build_ctdb_options()
  and have it use a helper function for readability.

* New functions check_persistent_databases() and set_ctdb_variables().

* Remove valgrind-specific stop code, since the general pkill should
  kill ctdbd when running under valgrind.

* Remove some bash-isms (e.g. >& /dev/null) since the script is /bin/sh.

* Make indentation consistent.

* Minor clean-ups.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 951dbcb29fd53cf51a08958efe185db4954d24f3)
2009-06-02 16:07:08 +10:00
Martin Schwenke
1f9ef465e3 Fix minor problem in previous initscript commit.
The valgrind start case should not use daemon, since this is specific
to Red Hat.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 1ea6af7007fe3b5a48d48440a0924c71d7a6000a)
2009-06-02 15:54:04 +10:00
Ronnie Sahlberg
e2810c0cb4 new version 1.0.84
(This used to be ctdb commit af1b3de978089a9819716b33c13c941b5558cb17)
2009-06-02 15:05:41 +10:00
Ronnie Sahlberg
45aa542064 teach ONNODE about deleted nodes
(This used to be ctdb commit 03d304e72a5839dc8d8d2e2312b346c21dca5774)
2009-06-02 15:03:44 +10:00
Ronnie Sahlberg
f49c71fa4f new version 1.0.83
(This used to be ctdb commit f236fa289f3115b1f4eb108eb668392dc520f61a)
2009-06-02 13:13:03 +10:00
Ronnie Sahlberg
676f7e0206 document how to remove a node from an existing cluster using 'ctdb
reloadnodes'

(This used to be ctdb commit e3d9722e332f132bd47dc41621d4e1d2b5c9c62a)
2009-06-02 12:43:11 +10:00
Martin Schwenke
4a09cc639b Initscript fixes, mostly for "stop" action.
Use a local variable $ctdbd so that we always run ctdbd from the
same place and so that we know what to kill.  This variable respects
the $CTDBD environment variable, which may be used to specify an
alternative location for the daemon.

In the important cases use "pkill -0 -f" to check if ctdbd is
running.  Also, remove the special case for killing ctdbd when running
under valgrind.  The regular case will handle this just fine.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit ee5d49324155e3e51371f6f8e5ed9eef4179f08d)
2009-06-02 10:01:50 +10:00
Ronnie Sahlberg
1dee7a2401 hide all DELETED nodes from the ctdb command output
(This used to be ctdb commit 91fdfee371d6be83af60cd38ac34afb295b9987a)
2009-06-01 15:43:30 +10:00
Ronnie Sahlberg
5371e3a793 lower the loglevel when we log that we skip an eventscript because it is not executable
(This used to be ctdb commit c265df3c7950aab51b8b6ef17040229b97345c35)
2009-06-01 15:29:36 +10:00
Ronnie Sahlberg
6c0c3577f8 don't try to queue packets for sending to (recently) deleted nodes since these nodes do not have a queue.
(This used to be ctdb commit 1b7c88ae7643f9bcc52b1d33095f97de88fc2316)
2009-06-01 14:56:19 +10:00
Ronnie Sahlberg
8a0880c843 when building the initial vnnmap, make sure to skip any deleted nodes
(This used to be ctdb commit 0cd66c744cd9533ce8d4c4374bcee3bf49b66dae)
2009-06-01 14:44:15 +10:00
Ronnie Sahlberg
dc5e4906cc use num_nodes and the nodes array instead of walking the vnnmap
when counting the number of active nodes

(This used to be ctdb commit df20cd9b05ad9ca72e32ccc42354eafc12b68c04)
2009-06-01 14:39:34 +10:00
Ronnie Sahlberg
e6170b5389 add a new node state : DELETED.
This is used to mark nodes as being DELETED internally in ctdb
so that nodes are not renumbered if / when they are removed from the nodes file.

This is used to be able to do "ctdb reloadnodes" at runtime without
causing nodes to be renumbered.
To do this, instead of deleting a node from the nodes file, just comment it out like

   1.0.0.1
   #1.0.0.2
   1.0.0.3

After removing 1.0.0.2 from the cluster,  the remaining nodes retain their
pnn's from prior to the deletion, namely 0 and 2

Any line in the nodes file that is commented out represents a DELETED pnn

(This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343)
2009-06-01 14:18:34 +10:00
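The numbering rule described in this commit can be sketched as a small parser. It assumes a nodes file with one address per line, no blank lines and no leading whitespace; `list_active_nodes_sketch` is a hypothetical name, not a ctdb function.

```shell
# Print "pnn address" for every active node. A commented-out line is a
# DELETED node: it prints nothing but still consumes its pnn.
list_active_nodes_sketch() {
    pnn=0
    while IFS= read -r line; do
        case "$line" in
            '#'*) ;;
            *) printf '%s %s\n' "$pnn" "$line" ;;
        esac
        pnn=$((pnn + 1))
    done < "$1"
}

# The nodes file from the commit message, with 1.0.0.2 commented out.
nodes_file=$(mktemp)
printf '%s\n' 1.0.0.1 '#1.0.0.2' 1.0.0.3 > "$nodes_file"
list_active_nodes_sketch "$nodes_file"
rm -f "$nodes_file"
```

The counter advances on every line, commented or not, which is why 1.0.0.3 keeps pnn 2 instead of being renumbered to 1.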
Ronnie Sahlberg
4259156050 don't remove the socket when the daemon stops. This can race if the
service is immediately restarted

(This used to be ctdb commit b18356764cd49d934eab901e596bb75c6e3ecdf8)
2009-05-29 18:16:13 +10:00
Ronnie Sahlberg
6feb7a1bf8 New attempt at TDB transaction nesting allow/disallow.
Make the default be that transaction nesting is not allowed and any attempt to create a nested transaction will fail with TDB_ERR_NESTING.

If an application can cope with transaction nesting and the implicit
semantics of tdb_transaction_commit(), it can enable transaction nesting
by using the TDB_ALLOW_NESTING flag.

(This used to be ctdb commit 3e49e41c21eb8c53084aa8cc7fd3557bdd8eb7b6)
2009-05-25 17:04:42 +10:00
Ronnie Sahlberg
96340bd166 Revert "we only need to have transaction nesting disabled when we start the new transaction for the recovery"
This reverts commit bf8dae63d10498e6b6179bbacdd72f1ff0fc60be.

(This used to be ctdb commit 87292029cb444ffab130ff7dae47a629c2d15787)
2009-05-25 16:55:27 +10:00
Ronnie Sahlberg
270907faec Revert "set the TDB_NO_NESTING flag for the tdb before we start a transaction from within recovery"
This reverts commit 1b2029dbb055ff07367ebc1f307f5241320227b2.

(This used to be ctdb commit 9762a3408f10409b629637d237ec513a825a6059)
2009-05-25 16:55:02 +10:00
Ronnie Sahlberg
c429ca114d Revert "add TDB_NO_NESTING. When this flag is set tdb will not allow any nested transactions and tdb_transaction_start() will implicitely _cancel() any pending transactions before starting any new ones."
This reverts commit 459e4ee135bd1cd24c15e5325906eb4ecfd550ec.

(This used to be ctdb commit f1c6f7dd47bb1081781c0a0d567a92bbbc0aa5d5)
2009-05-25 16:54:25 +10:00
Ronnie Sahlberg
caf0e863a4 remove the obsolete ipmux component.
this was replaced by LVS a long time ago

(This used to be ctdb commit dca41ec04788922ce5f4c52d346872b3e35f8cbb)
2009-05-25 12:33:52 +10:00
Ronnie Sahlberg
7b163bca18 fix the git path to the repository
(This used to be ctdb commit b0c32a96f4176747ca772be664888f5c3c483b98)
2009-05-25 12:15:13 +10:00
Ronnie Sahlberg
e85fb3d9c5 install the 31.clamd script as 644 by default
(This used to be ctdb commit e57c47b75fa501223c57040eac73392b42ae549d)
2009-05-25 12:11:07 +10:00
Ronnie Sahlberg
f62b433946 add 31.clamd to the install and the rpm
(This used to be ctdb commit bfc6ac07f8b7b326e75d8c9bf73051a440ee0011)
2009-05-25 12:11:01 +10:00
Ronnie Sahlberg
e999ade7bb From Flavio Carmo Junior <carmo.flavio@gmail.com>
Add an eventscript to manage ClamAV

(This used to be ctdb commit bb4ef6c4d2bc3578bdf4432517e98f85ec94e3b6)
2009-05-25 12:10:29 +10:00
Ronnie Sahlberg
691379b13d From Flavio Carmo Junior <carmo.flavio@gmail.com>
(with modifications)

Add a webpage about CLAMAV support in CTDB

(This used to be ctdb commit 5fc14f98902ae98abed35eaab3b3495226dcac38)
2009-05-25 12:08:50 +10:00
Ronnie Sahlberg
0891024f7a document the new support for ClamAV
(This used to be ctdb commit 39539a2d1784f04245ed7abc84c4f39e1601afa4)
2009-05-25 12:06:09 +10:00
Sumit Bose
887046352d fix re pattern to accept the new recovery lock times in the statistics output
(This used to be ctdb commit ba44aae7307b4fa56f7b2da2cd9d4a7ccd0a135e)
2009-05-25 11:15:00 +10:00
Ronnie Sahlberg
9921e1ec21 change the socket we use for sending gratuitous ARPs from AF_INET/SOCK_PACKET to AF_PACKET/SOCK_RAW
(This used to be ctdb commit 2c4c20d7803f4449f8d463314c40d4734ec80e2f)
2009-05-21 14:10:45 +10:00
Ronnie Sahlberg
26e1486db7 Whitespace changes and using the CTDB_NO_MEMORY() macro changes to
the previous patch.

(This used to be ctdb commit d623ea7c04daa6349b42d50862843c9f86115488)
2009-05-21 11:49:16 +10:00
Sumit Bose
2fcedf6dac add missing checks on so far ignored return values
Most of these were found during a review by Jim Meyering <meyering@redhat.com>

(This used to be ctdb commit 3aee5ee1deb4a19be3bd3a4ce3abbe09de763344)
2009-05-21 11:22:21 +10:00
Sumit Bose
11988fc77a structure member node_list_file is not used anywhere
(This used to be ctdb commit 0e84ea23d1d998d4d4ac7d8a858b3d8294f056cb)
2009-05-21 11:16:43 +10:00
Sumit Bose
9171a7784c structure member logfile is not used anywhere
(This used to be ctdb commit 4f86c991812c2d0bddbe3de9a9906cf5df118cd4)
2009-05-21 11:15:43 +10:00
Sumit Bose
f13c6e8a2c fix a configure warning while checking for netfilter.h
(This used to be ctdb commit fa5afee8e9a8fba6017bc58f87bc040de7206e63)
2009-05-21 11:14:28 +10:00
Sumit Bose
de36b5012a added a missing dependency
(This used to be ctdb commit 1d833163b57853b84f098dffdb3c5f50164fcc73)
2009-05-21 11:13:42 +10:00
Ronnie Sahlberg
9a3e19658d Change the loglevel of "registered tcp client for ..." to INFO
instead of ERR

(This used to be ctdb commit 92b5580c38c23b99c1692708540983b0c0fcd6cf)
2009-05-19 08:55:42 +10:00
Ronnie Sahlberg
934d8a6b5f From : Flavio Carmo Junior <carmo.flavio@gmail.com>
Add a helper function that checks whether a unix domain socket exists
and there is a daemon LISTENING to it  similar to the existing function
to check for a daemon LISTENING to a tcp/ip socket.

(This used to be ctdb commit 025a836ab3be3c078fccd8c10b10dfffbfdd94d0)
2009-05-19 08:47:19 +10:00
Volker Lendecke
7442461e9f Fix http://ctdb.samba.org/download.html
(This used to be ctdb commit 177295ba400fcaf47f026653f27a42a8ff798d36)
2009-05-19 08:40:00 +10:00
Christian Ambach
8e9736ac1f Remove error messages about a non-existing /var/log/log.ctdb when running ctdb with logging to syslog
(This used to be ctdb commit afdbf3c0df02decd823615134294abf2c8a8a5f3)
2009-05-14 18:59:31 +10:00
Ronnie Sahlberg
0d48af4741 add additional log info to track if/why we can't switch to client mode.
(This used to be ctdb commit 722171fc94a36ffe9e0a5c64502b916fde0a13a4)
2009-05-14 18:25:00 +10:00
Ronnie Sahlberg
98a54c4675 Track how long it takes to take out the recovery lock from both the main daemon and also from the recovery daemon.
Log this in "ctdb statistics".

Also add a variable "RecLockLatencyMs" that will log an error every time it takes longer than this to access the reclock file.

(This used to be ctdb commit 042377ed803bb8f7ca9d6ea1a387427b7b8ba45a)
2009-05-14 10:33:25 +10:00
Ronnie Sahlberg
26b37d29b4 new version 1.0.82
(This used to be ctdb commit 82ee458329968001bb03b2aec42e65f532f007b3)
2009-05-14 08:55:40 +10:00
Ronnie Sahlberg
be7137faa9 use scope host when adding the interface to loopback so we don't respond to ARPs for this ip
(This used to be ctdb commit fcd6226a6c00cf657532aa76804bfe029df21ba6)
2009-05-14 08:55:05 +10:00
Ronnie Sahlberg
016b37f1e2 change the prefix NATGW_ to CTDB_NATGW_
(This used to be ctdb commit b7ed7fd4a5fbd344d41caa1afa100b1f24506173)
2009-05-14 08:12:48 +10:00
Michael Adam
60bfafbf10 ping pong: fix logic for mmap reads vs. preads
Michael

(This used to be ctdb commit 0c88fa41bc3c629052bc137ed30c473ed10522fd)
2009-05-13 16:13:14 +10:00
Michael Adam
179d911826 maketarball.sh: add GPL license header
Michael

(This used to be ctdb commit 13270a011016bf20bbf721f6d083b2f113fdbc79)
2009-05-13 16:12:58 +10:00
Michael Adam
b1701e09df makerpms.sh: add GPL license header
Michael

(This used to be ctdb commit 7498e176817719eadd91201bbd0d9ceb91eefdae)
2009-05-13 16:12:41 +10:00
Michael Adam
01fb6e326b Remove generated binary files.
Noted by Mathieu Parent <math.parent@gmail.com>

Michael

(This used to be ctdb commit b321dfd1d23492169ac25ed901d49d7c69ad5340)
2009-05-13 16:01:53 +10:00
Ronnie Sahlberg
d7cefca723 remove NATGW_PRIVATE_IFACE from the documentation since we do not need
it any more.

(This used to be ctdb commit c967b234f59e5998bc8f2250062f4b0d1f39d820)
2009-05-12 18:21:26 +10:00
Ronnie Sahlberg
12400298c1 assign the natgw address to loopback and not the private network so that natgw will still work even when public and private networks are one and the same
(This used to be ctdb commit 2bd796b8a098074502fe20e3ab69098b2109c133)
2009-05-12 18:42:13 +10:00
Ronnie Sahlberg
42891227a4 add extra debug statements to the log to make it easier to see when a recovery daemon has hung due to the underlying filesystem hanging.
(This used to be ctdb commit 5b0067a4e335cbbf6e606646e612d4bfcfdb7441)
2009-05-12 18:39:34 +10:00
Ronnie Sahlberg
93a2829e94 check that a node is banned before trying to unban it.
(This used to be ctdb commit 4467b5f88d749d455854512f60a5d313cafa828b)
2009-05-12 18:32:41 +10:00
Martin Schwenke
2a09b4bad3 51_ctdb_bench.sh now allows a 2% difference between positive and
negative.  ctdb_bench.c checks to ensure the timer has advanced from 0
before dividing.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 723413f246399b25166462d2018237920515655f)
2009-05-12 14:45:46 +10:00
Martin Schwenke
d59cd199e4 Avoid floating point divide by 0 in ctdb_fetch.c's bench_fetch().
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 3c67e164eb92591f8763883430490805c1dfa9ed)
2009-05-12 14:45:14 +10:00
Martin Schwenke
7c7c5b3489 Bug fixes for tests: simple/12_ctdb_getdebug.sh and scripts/test_wrap.
simple/12_ctdb_getdebug.sh now recognises output with multi-digit node
numbers.

Sharing the ctdb directory via NFS and testing on a real cluster by
setting CTDB_TEST_REAL_CLUSTER didn't work by default.  The fix is to
hack scripts/test_wrap so that it tries to find a valid bin directory
next to the directory it is in.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit ea2ca769e1d1068fbbad843750b19acfd87360e0)
2009-05-12 14:44:30 +10:00
Ronnie Sahlberg
0dfc35641f From: Sumit Bose <sbose@redhat.com>
fix handling of AC_INIT

(This used to be ctdb commit 1c31fea7432b870169fb839c1fbba5a33dec8e8a)
2009-05-12 08:59:49 +10:00
Martin Schwenke
53c9643104 Fix lvsmaster and natgwlist nodespecs.
They both need to use a -Y option to ctdb and for natgwlist we only
want the 1st line.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit e781ff61e17d733349021bb036514f823c7cbfbb)
2009-05-12 08:58:57 +10:00
Martin Schwenke
6cf92b7c0a Updated onnode docs to reflect recent changes.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit cdf092d69a710310d82d1d67baa0ffb19f676927)
2009-05-12 08:58:41 +10:00
Martin Schwenke
6098464175 New lvs/lvsmaster and natgw/natgwlist nodespecs for onnode.
Some code re-factoring to implement this and to make it easy to
implement new ones.  New simpler implementation of echo_nth() no
longer uses deleted get_nth() function.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 29559f5dd099bec210e98909c9b2e048461b7c81)
2009-05-12 08:58:23 +10:00
Martin Schwenke
9616959bd6 New option "-o <prefix>" saves stdout from each node to file <prefix>.<ip>.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a0f5148ac749758e2dfbd6099e829c5bf1d900e6)
2009-05-12 08:58:04 +10:00
Martin Schwenke
9666d7bf0a Use ctdb_fetch_lock rather than ctdb_call.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 5f3d2e29a446972ac244e20a08e48f4c2f4ffef4)
2009-05-12 08:55:36 +10:00
Martin Schwenke
86ad711c37 41.httpd event script workaround for RHEL5-ism.
RHEL5 can SIGKILL httpd when stopping it, causing it to leak
semaphores.  This means that eventually a node runs out of semaphores
and httpd can't be started.  So, before we attempt to start httpd we
clean up any semaphores owned by apache.  We also try to restart httpd
in the monitor event if httpd has gone away.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 2d3fbbbb63f443686f9fec42c0bc2058d115806e)
2009-05-12 08:53:32 +10:00
Ronnie Sahlberg
54a5e6c0c8 Add a -Y machinereadable flag to "lvsmaster"
(This used to be ctdb commit bbae698656d5da9a4a5b0fbfc3003844f246d54b)
2009-05-11 14:44:59 +10:00
Ronnie Sahlberg
1ee122e165 in the "lvsmaster" command, return -1 if there is no lvsmaster
(This used to be ctdb commit ce6afbdef36e3c386b75709f73ef55efe0bd1987)
2009-05-11 13:56:28 +10:00
Ronnie Sahlberg
d6e1f04a67 new version 1.0.81
(This used to be ctdb commit a8019f20cd42a1965410fef5bac2c5b73657b38e)
2009-05-08 17:29:57 +10:00
Ronnie Sahlberg
e6e049060f From: Sumit Bose <sbose@redhat.com>
fix handling of AC_INIT and read version from ctdb.spec

(This used to be ctdb commit f7f64f92e26a0757af210d33288162eefcd07d79)
2009-05-06 20:32:39 +10:00
Michael Adam
c544371776 ping_pong: add GPL comment header with Tridge's copyright
Michael

(This used to be ctdb commit a87ef6a9206820d5110a7117240f743af010ff19)
2009-05-06 10:41:18 +10:00
Michael Adam
08cfdf0d63 ping_pong: get pread/pwrite prototypes from unistd.h
by defining _XOPEN_SOURCE to be 500 before including headers

Michael

(This used to be ctdb commit 96c79bddf7895e57ccf90f0d250bd08b7c4daf40)
2009-05-06 10:40:48 +10:00
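The mechanism named in commit 08cfdf0d63 can be illustrated with a minimal sketch: defining `_XOPEN_SOURCE` as 500 before any system header asks the C library to expose the `pread`/`pwrite` prototypes from `unistd.h`. The helper `roundtrip_byte` and the temporary file name are hypothetical and are not taken from ping_pong.c.

```c
/* The feature test macro must be defined before any system header. */
#define _XOPEN_SOURCE 500

#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: write one byte at the given offset of a
 * temporary file with pwrite(), read it back with pread(), and
 * return the byte read, or -1 on any failure. */
static int roundtrip_byte(unsigned char value, off_t offset)
{
    char path[] = "/tmp/pread_demo_XXXXXX";
    unsigned char out = 0;
    int result = -1;
    int fd;

    assert(offset >= 0);

    fd = mkstemp(path);
    if (fd == -1) {
        return -1;
    }
    /* The file stays usable through fd after its name is removed. */
    unlink(path);

    /* Neither call moves the file position, unlike lseek() + write(). */
    if (pwrite(fd, &value, 1, offset) == 1 &&
        pread(fd, &out, 1, offset) == 1) {
        result = out;
    }

    close(fd);
    return result;
}
```

Calling `roundtrip_byte(0x5a, 4096)` is expected to return the same byte on a system where `/tmp` is writable.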
Michael Adam
d68654ba5e ping_pong: reduce a couple of prototype warnings
Michael

(This used to be ctdb commit fce851621fe2099c9692acfbfaade24c3d69727a)
2009-05-06 10:40:08 +10:00
Michael Adam
73913bb7c5 packaging: also package ping_pong
Michael

(This used to be ctdb commit 300e84f7023e9194b313e96db943e4050bd64e68)
2009-05-06 10:39:47 +10:00
Michael Adam
24b9b6a986 build: also build and install ping_pong
Michael

(This used to be ctdb commit 200de8f299c8fa44d6dc696532f1a947132e7ec4)
2009-05-06 10:39:35 +10:00
Michael Adam
bc6c3d03e8 add tridge's ping_pong.c to the utils folder
Michael

(This used to be ctdb commit fe59ecb697fb4686ad8ea2fe4ec1cc7b4629e74f)
2009-05-06 10:39:19 +10:00
Ronnie Sahlberg
9300933b6a From Sumit Bose <sbose@redhat.com>
add more 64bit platforms to configure.ac and preserve cli settings

(This used to be ctdb commit 8a86f65826b58c2ee3f07f221a4fc82193beec81)
2009-05-06 10:29:07 +10:00
Andrew Tridgell
5bca205f75 added link to michaels sambaxp papers
(This used to be ctdb commit 48c011188c624f10c9a754d4ead27db558088fd4)
2009-05-06 10:18:34 +10:00
Andrew Tridgell
967947ea80 allow pages in subdirs
(This used to be ctdb commit 68da42c4ee92fcdfe65baf04c1a2d6446583858b)
2009-05-06 10:17:39 +10:00
Andrew Tridgell
2ef63a74f2 more subdir html support
(This used to be ctdb commit 9ce9a500543de4f0aef5e8c28cda9bbc3c9d1b77)
2009-05-06 10:16:54 +10:00
Andrew Tridgell
4f4f03f84a use less intrusive smbstatus call in periodic connections cleanup
(This used to be ctdb commit a152fdc79e3360049aee66c3e628237a91df181f)
2009-05-06 08:20:55 +10:00
root
08492a524b change the talloc hierarchy for the main transaction_start context and the individual transaction_all handles
(This used to be ctdb commit 919b29850671b59bcf748aec25658ea09d8b4f1c)
2009-05-06 07:33:07 +10:00
root
af25fa38f3 fixed a problem with clients disconnecting during a traverse
When a client (such as smbstatus) is killed, it may have outstanding
traverse children on remote nodes. We need to catch the client
disconnect in ctdbd and send a control to all nodes telling them to
kill those outstanding traverse children.

(This used to be ctdb commit f2fb2df4619a14f7f6c11f9132ee7d793028042c)
2009-05-06 07:32:25 +10:00
root
4cef9994a5 new version 1.0.80
(This used to be ctdb commit bf1b76955db6ba00ec64686b53084268573ba6a0)
2009-05-01 12:37:52 +10:00
root
bfea570af4 when tracking the ctdb statistics, only decrement num_clients and pending_calls IFF the counter is >0
Otherwise, if the statistics are reset to zero after a counter has been incremented (a client connects), the counter is decremented to a negative number when the client disconnects.

This is a purely cosmetic patch with no operational impact on ctdb.

(This used to be ctdb commit 72f1c696ee77899f7973878f2568a60d199d4fea)
2009-05-01 12:30:26 +10:00
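The guard described in commit bfea570af4 could be sketched roughly as below. The counter names `num_clients` and `pending_calls` come from the commit message; the struct layout and the helpers `stat_decrement` and `stats_reset` are assumptions made for illustration, not the actual ctdb code.

```c
/* Hypothetical stand-in for the daemon statistics; only the two
 * counters named in the commit message are modelled. */
struct stats {
    int num_clients;
    int pending_calls;
};

/* Decrement a counter only if it is above zero, so that a reset
 * between connect and disconnect cannot push it below zero. */
static void stat_decrement(int *counter)
{
    if (*counter > 0) {
        (*counter)--;
    }
}

/* Resetting the statistics zeroes every counter. */
static void stats_reset(struct stats *s)
{
    s->num_clients = 0;
    s->pending_calls = 0;
}
```

With this guard, a sequence of connect, reset, disconnect leaves `num_clients` at 0 instead of -1.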
root
6793f077a8 Add a new variable VerifyRecoveryLock which can be used to disable the test that the recovery daemon holds the lock properly when performing a recovery
(This used to be ctdb commit 329df9e47e6ca8ab5143985a999e68f37c6d88a5)
2009-05-01 01:17:59 +10:00
Ronnie Sahlberg
2e3542b5e5 don't unconditionally kill/restart ctdb when given "service ctdb start". Only start ctdb if it is not already running, and print an error message otherwise
(This used to be ctdb commit 94343309992929a592348c936e09a7b4f8b512c1)
2009-04-30 17:38:30 +10:00
Ronnie Sahlberg
3a6ace330e we only need to have transaction nesting disabled when we start the new transaction for the recovery
(This used to be ctdb commit bf8dae63d10498e6b6179bbacdd72f1ff0fc60be)
2009-04-26 08:48:15 +10:00
Ronnie Sahlberg
d20bb2498d set the TDB_NO_NESTING flag for the tdb before we start a transaction from within recovery
(This used to be ctdb commit 1b2029dbb055ff07367ebc1f307f5241320227b2)
2009-04-26 08:42:54 +10:00
Ronnie Sahlberg
777c634eae add TDB_NO_NESTING. When this flag is set tdb will not allow any nested transactions and tdb_transaction_start() will implicitly _cancel() any pending transactions before starting any new ones.
(This used to be ctdb commit 459e4ee135bd1cd24c15e5325906eb4ecfd550ec)
2009-04-26 08:38:37 +10:00
Ronnie Sahlberg
38ea6708dd add a tunable RecoveryDropAllIPs to control how long a node that has been stuck in recovery waits before it yields all public addresses.
This now defaults to 60 seconds.

This is useful if a split brain occurs due to network partitioning, since it makes sure that the "other half" of the cluster that does not contain the recovery master eventually releases all IPs, thus avoiding a duplicate IP situation for the public addresses.

(This used to be ctdb commit 70f21428c9eec96bcc787be191e7478ad68956dc)
2009-04-24 18:28:08 +10:00
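The timeout check behind RecoveryDropAllIPs in commit 38ea6708dd might look something like the following sketch. Only the tunable name and the 60 second default come from the commit message; the function `should_drop_all_ips`, its parameters, and the choice of "at least the timeout" as the comparison are assumptions.

```c
#include <stdbool.h>
#include <time.h>

/* Default taken from the commit message: 60 seconds. */
#define RECOVERY_DROP_ALL_IPS_DEFAULT 60

/* Hypothetical check: returns true once the node has been in
 * recovery for at least timeout_secs seconds, at which point it
 * would release all of its public addresses. */
static bool should_drop_all_ips(time_t recovery_started, time_t now,
                                unsigned int timeout_secs)
{
    return difftime(now, recovery_started) >= (double)timeout_secs;
}
```

A node that entered recovery 30 seconds ago would keep its addresses under the default; one that entered 61 seconds ago would release them.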
Ronnie Sahlberg
ce3283f7cb increase the loglevel for the message we print when we automatically release all ips when we have been in recovery for too long
(This used to be ctdb commit 7af060ded5113a49832f6a08a942523a202586b3)
2009-04-24 18:11:10 +10:00
Ronnie Sahlberg
3363480da4 tweak some timeouts so that we do trigger a banning even if the control hangs/times out
(This used to be ctdb commit 1860a365e6ba8212e15c33016c80a2adcf8d10f4)
2009-04-24 14:45:07 +10:00
Ronnie Sahlberg
e5532b6f26 If we can not pull a database from a node during recovery, mark this node as a "culprit" so that it will eventually become banned.
(This used to be ctdb commit 69dc3bf60b86d8df6dc5c7c6ebf303e847fb2ba9)
2009-04-24 14:44:57 +10:00
Andrew Tridgell
37e2417c59 change shutdown level for ctdb to be 01
We want ctdb to shut down first, as it manages many other
services. With the old level of 32 the NFS service would shut down
first, and that would trigger ctdb to do a recovery. Then ctdb itself
would be shut down a few seconds later, which causes a lot of error
messages in the other nodes' logs.

(This used to be ctdb commit 2f952af1a12e81a652ec9a4794db96f9593f2676)
2009-04-23 11:35:42 +10:00
Ronnie Sahlberg
8752745173 new version 1.0.79
(This used to be ctdb commit 6c900aa343096c5e1e297e055c36832ffa5028dd)
2009-04-08 12:56:52 +10:00
Ronnie Sahlberg
4be3e86405 create a function "remote_ip" which can be used from scripts to remove a single ip from an interface.
use this function from the natgw eventscript

(This used to be ctdb commit feab5f30b2d6cebf4dd28abc5a81f93424a4c852)
2009-04-08 12:49:28 +10:00
Ronnie Sahlberg
976e76f408 set libdir to ../lib64 on x86-64 platforms
(This used to be ctdb commit a9f851caec2525ccbb3a6d6283eaef52b89a4eb2)
2009-04-08 10:45:00 +10:00
Ronnie Sahlberg
62afe2ff71 install ctdb.pc from the RPM
(This used to be ctdb commit 1b47ddc97373376b416a50939b74dc8c926fc917)
2009-04-08 09:34:20 +10:00
Ronnie Sahlberg
0f70c47008 From Mathieu Parent <math.parent@gmail.com>
Install the pkgconfig file

(This used to be ctdb commit 7c4389cc0baa43a0ffa9fb08944c253db7885807)
2009-04-08 09:21:11 +10:00
Mathieu Parent
6efe2b6533 (This used to be ctdb commit b0718551f55d5da9be0e6aba233f57c1ff8509be)
2009-04-08 09:14:20 +10:00
Ronnie Sahlberg
59fd3bd564 install /etc/ctdb/notify.sh as executable.
this addresses bug 6250

(This used to be ctdb commit b8be5b06c3359d037db336dc12d38e0018349951)
2009-04-08 08:48:55 +10:00
Ronnie Sahlberg
a87e6f56ae we only need to switch into client mode from the eventscript child if we are running the monitor event
(This used to be ctdb commit 13e2c9044950f21918e4610726e73ed3d8f76920)
2009-04-06 14:03:09 +10:00
Ronnie Sahlberg
e5e2f6f8f7 increase the listen queue. Now that the eventscripts may become clients and connect back to the server we do get a lot more concurrent connection attempts (takeip/releaseip are performed in parallel)
(This used to be ctdb commit 018f8b0b1823ef59b46f1a671aec5309d10628f4)
2009-04-06 14:00:41 +10:00
Ronnie Sahlberg
1f87ee85bc use _exit() and not exit() when we terminate a failed eventscript child process
(This used to be ctdb commit 33b296cee177adc61edc911caec8c24b3efa8441)
2009-04-06 13:16:36 +10:00
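The distinction drawn in commit 1f87ee85bc can be shown with a small sketch. A forked child that calls `_exit()` terminates immediately, without running `atexit()` handlers or flushing stdio buffers inherited from the parent, whereas `exit()` would do both. The helper `run_failing_child` is hypothetical and is not the ctdb eventscript code.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that fails immediately with the
 * given status and return the exit status the parent observes, or
 * -1 on error. */
static int run_failing_child(int status)
{
    int wstatus;
    pid_t pid = fork();

    if (pid == -1) {
        return -1;
    }
    if (pid == 0) {
        /* Child: leave at once. _exit() skips atexit() handlers and
         * stdio flushing, so cleanup registered by the parent is not
         * run a second time from the child. */
        _exit(status);
    }

    /* Parent: collect the child's exit status. */
    if (waitpid(pid, &wstatus, 0) == -1) {
        return -1;
    }
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

For example, `run_failing_child(1)` is expected to return 1 in the parent.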
Ronnie Sahlberg
2e1208e648 We don't need to verify the nodemap on remote nodes that are banned
(This used to be ctdb commit 7f8f9385deee6eff2b7303147bc6412bbdc122df)
2009-04-06 12:00:22 +10:00
Ronnie Sahlberg
2393df3989 if we can't pull the remote nodemap off a node we should mark it as a culprit so it eventually becomes banned.
(This used to be ctdb commit 0889ae3c237bdb3bd72d45f2f64f5e5d8420870c)
2009-04-02 14:50:43 +11:00