1
0
mirror of https://github.com/samba-team/samba.git synced 2025-03-08 04:58:40 +03:00

343 Commits

Author SHA1 Message Date
Ronnie Sahlberg
9c6aa4e420 update the eventscript to ensure that stopped nodes can not become the natgw master
also verify that we actually do have a natgw master available if this is configured and make the node unhealthy if not.

(This used to be ctdb commit 7f273ee769d671d8c8be87c9187302fb77e814f3)
2009-07-17 09:45:05 +10:00
Ronnie Sahlberg
5ce69e2fa3 if all nodes are STOPPED, pick one of the STOPPED nodes as natgw master
(This used to be ctdb commit 8bbd96cfbbe98f3fc19e432797cbf4478f753a0b)
2009-07-17 09:36:22 +10:00
Ronnie Sahlberg
bf9ad9c934 Do not allow STOPPED or DELETED nodes to become the NATGW master
(This used to be ctdb commit 4505ea15408ad40dd8deb4041fd75a65a0ad9336)
2009-07-17 09:29:58 +10:00
Ronnie Sahlberg
88f3c40d9c add two new controls, CTOP_NODE and CONTINUE_NODE
that are used to stop/continue a node instead of using modflags messages

(This used to be ctdb commit 54b4a02053a0f98f8c424e7f658890254023d39a)
2009-07-09 12:22:46 +10:00
Ronnie Sahlberg
d6a5fd5c9d remove the header printed for the machinereadable output for natgwlist
(This used to be ctdb commit 049271c83a09afb8d6c3e5212cf9ca782956b0c6)
2009-07-09 11:43:37 +10:00
Ronnie Sahlberg
9f0dc4b93b Add a new node flag : STOPPED
This node flag means the node is DISABLED and that all its public ip addresses
are failed over, but also that it has been removed from the VNNmap.

A STOPPED node should be in recovery mode active untill restarted using the continue command.

Adding two new commands "ctdb stop" "ctdb continue"

(This used to be ctdb commit d47dab1026deba0554f21282a59bd172209ea066)
2009-07-09 11:38:18 +10:00
Ronnie Sahlberg
20887a15ad Perform an ipreallocate efter each enable/disable.
This will force a wait until the ip addresses have been reallocated after a disable/enable command and will make scripting of enable/disable more predictable.

This will cause the command enable/disable to wait until the ip realocation that normally follows shortly after a enable/disable to finish before the command returns to the prompt.

(This used to be ctdb commit 6e1f60d8d780c1240aaabb78ecc8550d0480cd7e)
2009-07-06 11:49:55 +10:00
Ronnie Sahlberg
289c58e9b6 add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process.
the ctdb command will block until the ip reallocation has comleted

(This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216)
2009-07-02 13:00:26 +10:00
Ronnie Sahlberg
8e435c0605 update enable/disable
(This used to be ctdb commit b99afc98bedf1a51d315e311f27c3fc55fd940e7)
2009-07-01 09:33:08 +10:00
Ronnie Sahlberg
2770cb4397 show the valid debuglevels that can be used in the error text when an invalid level was specified to ctdb setdebug
(This used to be ctdb commit 421c0566094b91221fab2ea68f2c9bd35d5dfbcb)
2009-07-01 09:21:07 +10:00
Ronnie Sahlberg
93026f4cbf update the handling of debug levels so that we always can use a literal instead of a numeric value.
validate the input values used and refuse setting the debug level to an unknown value

(This used to be ctdb commit daec49cea1790bcc64599959faf2159dec2c5929)
2009-07-01 09:17:13 +10:00
Ronnie Sahlberg
9802a0c2f6 when no debuglevel is specified, make 'ctdb setdebug' show the available options
(This used to be ctdb commit f4b0825d9da34578b9f90dc9bd7f99fcc2519ddf)
2009-07-01 08:26:00 +10:00
Ronnie Sahlberg
5b235c3999 add a control to set the reclock file
(This used to be ctdb commit 36cc2e586f03fa497ee9b06f3e6afc80219c4aaa)
2009-06-25 14:25:18 +10:00
Ronnie Sahlberg
2b253c094c add a control to read the current reclock file from a node
(This used to be ctdb commit ed6a4cbcdcbb4e0df83bec8be67c30288bf9bd41)
2009-06-25 12:17:19 +10:00
Martin Schwenke
635da189dc Fix minor onnode bugs relating to local daemons.
Commit a0f5148ac749758e2dfbd6099e829c5bf1d900e6 caused a subtle
regression.  Due to the subtlety, this description is much longer than
the 1 line patch that fixes it!  The regression, where a process that
invokes onnode is unexpectedly blocked, is only apparent if the
following conditions are met:

1. $CTDB_NODES_SOCKETS is set;
2. The command passed to onnode attempts to background a process; and
3. onnode is run in certain types of subshell (e.g. foo=$(onnode ...)).

In particular, when testing against local daemons (i.e. condition (1)
is met), tests/simple/07_ctdb_process_exists.sh would fail (because it
does both (2), (3)).

The problem is caused by the use of file descriptor 3 in the code that
allows separate filtering of stdout and stderr.  A backgrounded
process will have this descriptor open and the $(...) construct
appears to wait for all file descriptors to be closed.  This only
happens with local daemons because SSH is replaced by a shell and file
descriptor 3 leaks into that shell.  It does not occur when SSH is
used because the file descriptor does not leak into the remote shell
where the process is backgrounded.

The fix is simply to redirect file descriptor 3 to /dev/null in the
fakessh function, which is used when $CTDB_NODES_SOCKETS is set.

Also fixed is another minor bug when the -o option and
$CTDB_NODES_SOCKETS are used in combination.  The code uses the node
name as a suffix for the output filename(s).  Usually this is an IP
address.  However, when $CTDB_NODES_SOCKETS is in use the node name is
the socket name, which might be a path several directories deep.
Each output file is created via a simple redirection and this would
fail if unexpected directories appear in the filename.  3 possible
fixes were considered:

1. Replace all '/'s in the node name by '_'s.  Nice and simple.
2. Use the basename of the node name.  However, sockets may be in
   different directories but have the same basename.
3. Create all required directories before redirecting.  This is a
   little more complex and probably doesn't meet the user's
   expectations.

Option (1) is implemented here.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 5d320099025b6835eda3a1e431708f7e0a6b0ba6)
2009-06-19 18:02:17 +10:00
Ronnie Sahlberg
2bb687c4cd remove unused variable
(This used to be ctdb commit 2a52336ec021dfe8d56ba72726feb7b2dbd41f68)
2009-06-09 10:58:46 +10:00
Ronnie Sahlberg
ac931b1371 dont require particular values for NoIPFailback and DeterministicIPs when
using ctdb moveip

(This used to be ctdb commit d350c631850377c09968d2978ef57d2bd0d50116)
2009-06-09 10:57:46 +10:00
Ronnie Sahlberg
f135684766 improve ctdb moveip so that it does not always trigger a recovery.
(This used to be ctdb commit 0ca28d7336463ecd2ff65620d8dbcbb496991531)
2009-06-09 10:56:50 +10:00
Ronnie Sahlberg
f6ccf96898 try avoiding to cause a recovery when deleting a public ip from a node
(This used to be ctdb commit 6318ea13464e2fe630084c40802d8e697c2cb999)
2009-06-05 17:57:14 +10:00
Ronnie Sahlberg
b046f5e3aa when adding an ip, try manually adding and takingover the ip instead of triggering a full recovery to do the same thing
(This used to be ctdb commit 4d5d22e64270cfb31be6acd71f4f97ec43df5b2c)
2009-06-05 17:00:47 +10:00
Ronnie Sahlberg
79eef7f2b5 dont list DELETED nodes in the ctdb listnodes output
(This used to be ctdb commit 7eb137aa4c24c69bd93b98fb3c7108e5f3288ebd)
2009-06-04 13:25:58 +10:00
Ronnie Sahlberg
f691b96d84 make it possible to run 'ctdb listnodes' also if the daemon is not running.
in this case, read the nodes file directly instead of asking the local daemon for the list.

add an option -Y to provide machinereadable output to listnodes

(This used to be ctdb commit 4a55cacc4f5526abd2124460b669e633deeda408)
2009-06-04 13:21:25 +10:00
Ronnie Sahlberg
45aa542064 teach ONNODE about deleted nodes
(This used to be ctdb commit 03d304e72a5839dc8d8d2e2312b346c21dca5774)
2009-06-02 15:03:44 +10:00
Ronnie Sahlberg
1dee7a2401 hide all DELETED nodes from the ctdb command output
(This used to be ctdb commit 91fdfee371d6be83af60cd38ac34afb295b9987a)
2009-06-01 15:43:30 +10:00
Ronnie Sahlberg
e6170b5389 add a new node state : DELETED.
This is used to mark nodes as being DELETED internally in ctdb
so that nodes are not renumbered if / when they are removed from the nodes file.

This is used to be able to do "ctdb reloadnodes" at runtime without
causing nodes to be renumbered.
To do this, instead of deleting a node from the nodes file, just comment it out like

   1.0.0.1
   #1.0.0.2
   1.0.0.3

After removing 1.0.0.2 from the cluster,  the remaining nodes retain their
pnn's from prior to the deletion, namely 0 and 2

Any line in the nodes file that is commented out represents a DELETED pnn

(This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343)
2009-06-01 14:18:34 +10:00
Sumit Bose
2fcedf6dac add missing checks on so far ignored return values
Most of these were found during a review by Jim Meyering <meyering@redhat.com>

(This used to be ctdb commit 3aee5ee1deb4a19be3bd3a4ce3abbe09de763344)
2009-05-21 11:22:21 +10:00
Christian Ambach
8e9736ac1f Remove error messages about a non-existing /var/log/log.ctdb when running ctdb with logging to syslog
(This used to be ctdb commit afdbf3c0df02decd823615134294abf2c8a8a5f3)
2009-05-14 18:59:31 +10:00
Ronnie Sahlberg
98a54c4675 Track how long it takes to take out the recovery lock from both the main dameon and also from the recovery daemon.
Log this in "ctdb statistics".

Also add a varaible "RecLockLatencyMs" that will log an error everytime it takes longer than this to access the reclock file.

(This used to be ctdb commit 042377ed803bb8f7ca9d6ea1a387427b7b8ba45a)
2009-05-14 10:33:25 +10:00
Ronnie Sahlberg
93a2829e94 check that a node is banned before trying to unban it.
(This used to be ctdb commit 4467b5f88d749d455854512f60a5d313cafa828b)
2009-05-12 18:32:41 +10:00
Martin Schwenke
53c9643104 Fix lvsmaster and natgwlist nodespecs.
They both need to use a -Y option to ctdb and for natgwlist we only
want the 1st line.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit e781ff61e17d733349021bb036514f823c7cbfbb)
2009-05-12 08:58:57 +10:00
Martin Schwenke
6098464175 New lvs/lvsmaster and natgw/natgwlist nodespecs for onnode.
Some code re-factoring to implement this and to make it easy to
implement new ones.  New simpler implementation of echo_nth() no
longer uses deleted get_nth() function.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 29559f5dd099bec210e98909c9b2e048461b7c81)
2009-05-12 08:58:23 +10:00
Martin Schwenke
9616959bd6 New option "-o <prefix>" saves stdout from each node to file <prefix>.<ip>.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a0f5148ac749758e2dfbd6099e829c5bf1d900e6)
2009-05-12 08:58:04 +10:00
Ronnie Sahlberg
54a5e6c0c8 Add a -Y machinereadable flag to "lvsmaster"
(This used to be ctdb commit bbae698656d5da9a4a5b0fbfc3003844f246d54b)
2009-05-11 14:44:59 +10:00
Ronnie Sahlberg
1ee122e165 in the "lvsmaster" command, return -1 if there is no lvsmaster
(This used to be ctdb commit ce6afbdef36e3c386b75709f73ef55efe0bd1987)
2009-05-11 13:56:28 +10:00
Ronnie Sahlberg
6721546b53 change the ctdb command table to allow us to describe commands which can be run independtly of the ctdb daemon.
create a new debugging command xpnn which discovers the pnn of the local node and which works even if the local daemon is not running

(This used to be ctdb commit cd78765f9400d7abce7929a2dd199f65226e7664)
2009-03-25 14:46:05 +11:00
Ronnie Sahlberg
d7ff332896 update how the NATGW configuration works.
allow the cluster to be partitioned into multiple disjoint natgw subsets

(This used to be ctdb commit 1046885cd22b5001e0251de2e536b5f6793459be)
2009-03-25 13:37:57 +11:00
Ronnie Sahlberg
7265c713db we need to set the port properly in the parse_ip helper
(This used to be ctdb commit 43fe18d86995744ba61c7a6405b70edcb265930a)
2009-03-24 13:45:11 +11:00
root
629d5ee1fa add a new command "ctdb scriptstatus"
this command shows which eventscripts were executed during the last monitoring cycle and the status from each eventscript.

If an eventscript timedout or returned an error we also
show the output from the eventscript.

Example :
[root@rcn1 ctdb-git]# ./bin/ctdb scriptstatus
6 scripts were executed last monitoring cycle
00.ctdb              Status:OK    Duration:0.021 Mon Mar 23 19:04:32 2009
10.interface         Status:OK    Duration:0.048 Mon Mar 23 19:04:32 2009
20.multipathd        Status:OK    Duration:0.011 Mon Mar 23 19:04:33 2009
40.vsftpd            Status:OK    Duration:0.011 Mon Mar 23 19:04:33 2009
41.httpd             Status:OK    Duration:0.011 Mon Mar 23 19:04:33 2009
50.samba             Status:ERROR    Duration:0.057 Mon Mar 23 19:04:33 2009
   OUTPUT:ERROR: Samba tcp port 445 is not responding

Add a new helper function "switch_from_server_to_client()" which both
the recovery daemon can use as well as in the child process we start for running the actual eventscripts.

Create several new controls, both for the eventscript child process to inform the master daemon of the current status of the scripts as well as for the ctdb tool to extract this information from the runninc daemon.

(This used to be ctdb commit c98f90ad61c9b1e679116fbed948ddca4111968d)
2009-03-23 19:07:45 +11:00
Ronnie Sahlberg
4d2195c503 The wbinfo --sequence command has been depreciated in favor of the new
--online-status command

(This used to be ctdb commit b6e34503ac094a274a569a69e3d93d92ad911f4d)
2009-03-19 10:43:57 +11:00
root
4088e0aceb make sure we can collect proper mmfs data
(This used to be ctdb commit 76d655f9aa3ebd39e7a40d0bbd85e40d08f3e90b)
2009-03-12 12:33:19 +11:00
root
7a11082f0f collect net conf list in ctdb_diagnostics
(This used to be ctdb commit 0bb130090b8dce5f85b0cb178a19f877759c0caa)
2007-03-10 14:10:21 +11:00
root
b1e7724eb8 check the static-routes file if it exists
(This used to be ctdb commit 9ce84a7915abaa987160ecbcae63128a9ed0a741)
2007-03-10 13:45:38 +11:00
Ronnie Sahlberg
5c7570b103 Merge branch 'martins'
(This used to be ctdb commit fe4eea45c6b5702a794424037c3f2ab4241d5e5e)
2009-02-18 13:10:03 +11:00
Michael Adam
3cca0f75e4 Fix treatment of link local ipv6 addresses: set the scope id.
metze / Michael

Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit 9d12de1ca6107801dada927729e755c0949d73bf)
2009-01-19 22:50:53 +01:00
Martin Schwenke
9e3ccd9d69 Merge commit 'origin/master' into martins
(This used to be ctdb commit 099a1605574c7a8d232fd4c2d0c65e55aedeafad)
2008-12-17 15:05:44 +11:00
root
6c1359ab0d add better errorchecking that nodes we try to talk to using the "ctdb" tool actually exist and that it is connected.
two new dedicated ctdb error codes
21: node does not exist
22: node is disconnected

(This used to be ctdb commit 7ee6db06162ad5a554058bb6160ad37b24fe42e0)
2008-12-17 14:26:01 +11:00
root
1bf3006665 update the "ctdb recover" command.
block and wait until the clustered has completed the recovery before returning.
this  makes it easier to script since it avoids the common need for
   ctdb recover
   ... complex loop to wait for recovery to complete ...
   script continues

(This used to be ctdb commit 8a0df9324a03b0f17772c64a9331236126c22124)
2008-12-10 12:06:51 +11:00
root
1209079672 add a CTDB_TIMEOUT variable for the ctdb tool.
If set this specified the maximum runtime for the ctdb tool before it will terminate with status == 20
Just like the -T ...  option would.

(This used to be ctdb commit c404d57afb2adda039e676877838927d3073df11)
2008-12-10 12:01:19 +11:00
root
58bf3804f0 make sure we return an errorcode when the ctdb command has hung and is timeodout by the -T <timeout> setting
(This used to be ctdb commit 993f626e603b9bbc02942bb55096d63b9a4f456b)
2008-12-10 11:49:51 +11:00
Martin Schwenke
5dcc100e3e Merge commit 'origin/master' into martins
(This used to be ctdb commit 674d1660e5602f2fab1eaf219a6b8b5ddf24c402)
2008-12-10 11:42:02 +11:00