mirror of https://github.com/samba-team/samba.git synced 2025-02-11 17:58:16 +03:00

459 Commits

Author SHA1 Message Date
Ronnie Sahlberg
6f1221e9e1 Add the number of performed recoveries to the "ctdb statistics" output.
(This used to be ctdb commit fa045733cb81412f0d02ab52d74eabc7efca8b3d)
2010-05-11 09:44:53 +10:00
Ronnie Sahlberg
4a43428440 The recent change to the recovery daemon to keep track of and
verify that all nodes agree on the most recent ip address assignments
broke "ctdb moveip ..." since that call would never trigger
a full takeover run and thus would immediately trigger an inconsistency.

Add a new message to the recovery daemon where we can tell the recovery daemon to update its assignments.

BZ62782

(This used to be ctdb commit e7069082e5f0380dcddee247db8754218ce18cab)
2010-05-03 15:47:17 +10:00
Ronnie Sahlberg
05dcbed90e ctdb regsrvids is much more useful for testing if it sleeps once it has registered its srvid.
Otherwise, as soon as it terminates, ctdbd will deregister the id automatically.

(This used to be ctdb commit 23b059dcb8074872d7900b225790d4df7da071b6)
2010-02-22 15:34:26 +11:00
Ronnie Sahlberg
e01c8454ef commands that relate to manual failover of ip addresses (moveip)
can sometimes take a long time, so allow a longer timeout for the controls used.

(This used to be ctdb commit 144c69b633eeb17e120f962162feed6de3dc16a6)
2010-02-09 18:34:47 +11:00
Ronnie Sahlberg
ca9386a7f4 dont just exit(0) upon successful completion of waiting for an ipreallocate to finish.
return success back to the caller instead.

otherwise things like 'ctdb enable -n all' will just finish after the first disabled node has become enabled.

(This used to be ctdb commit f4eb41cd3a1099da8265351818fba9bd4688a188)
2010-02-09 14:35:10 +11:00
Ronnie Sahlberg
7a889c5f1d When trying to enable/disable a node,
check if the node is already enabled/disabled and log an informational
message if so.

(This used to be ctdb commit c3eec8f10764a647106087099eeb47b7196f7aac)
2010-02-04 10:03:21 +11:00
Ronnie Sahlberg
7a5254ae69 add two new debug controls to send and receive messages
ctdb msglisten and msgsend

(This used to be ctdb commit 8c89aac20260dc7f3746e29fe99f17422a77cb88)
2010-02-04 09:45:32 +11:00
Martin Schwenke
52dbd65825 onnode: update algorithm for finding nodes file.
2 changes:

* If a relative nodes file is specified via -f or $CTDB_NODES_FILE but
  this file does not exist then try looking for the file in /etc/ctdb
  (or $CTDB_BASE if set).

* If a nodes file is specified via -f or $CTDB_NODES_FILE but this
  file does not exist (even when checked as per above) then do not
  fall back to /etc/ctdb/nodes (or $CTDB_BASE if set).  The old
  behaviour was surprising and hid errors.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 60aa570aaa77d293b963105b3f605f9625a4594b)
2010-01-21 18:52:44 +11:00
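
The lookup order described in the onnode commit above can be sketched as follows. This is a minimal illustration in Python, not the onnode shell code itself; the function name `find_nodes_file`, its parameters, and the behaviour when no file is specified (defaulting to `nodes` under the base directory) are assumptions made for the example.

```python
import os
from typing import Optional


def find_nodes_file(specified: Optional[str], base: Optional[str] = None) -> str:
    """Return the nodes file to use, or raise if a specified file is missing."""
    if base is None:
        # Respect $CTDB_BASE rather than hard-coding /etc/ctdb.
        base = os.environ.get("CTDB_BASE", "/etc/ctdb")

    if specified is None:
        # Assumption: with no -f and no $CTDB_NODES_FILE, use the default file.
        return os.path.join(base, "nodes")

    if os.path.exists(specified):
        return specified

    if not os.path.isabs(specified):
        # A relative name that does not exist is retried under the base directory.
        candidate = os.path.join(base, specified)
        if os.path.exists(candidate):
            return candidate

    # No silent fallback to the default nodes file: report the error instead.
    raise FileNotFoundError(f"nodes file not found: {specified}")
```

Raising an error in the last step mirrors the stated intent of the change: the old fallback was surprising and hid mistakes.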
Martin Schwenke
7569b21f2d onnode - respect $CTDB_BASE rather than hard-coding /etc/ctdb.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 503e4908b3028330bc25dc6de8561dbd53ee6a8d)
2010-01-21 18:52:31 +11:00
Stefan Metzmacher
f2854f75c8 tools/ctdb: add PartiallyOnline state for "ctdb status" and "ctdb status -Y"
This is based on the GET_IFACES control against each node.

metze

(This used to be ctdb commit 38cb972382a09f830673277d0a9bd5d20deafff2)
2010-01-20 11:11:00 +01:00
Stefan Metzmacher
a6437bc707 tools/ctdb: display interfaces in "ctdb ip" and "ctdb ip -Y" outputs
metze

(This used to be ctdb commit dffa2b05acce8b73c2fdd085311732bf57f01b7f)
2010-01-20 11:11:00 +01:00
Stefan Metzmacher
df5805d6a0 tools/ctdb: add "ctdb ipinfo <ip>"
metze

(This used to be ctdb commit e05e236fc019bfd3b316609a7c190e0e028a4bbc)
2010-01-20 11:11:00 +01:00
Stefan Metzmacher
a6803f42a5 tools/ctdb: add "ctdb setifacelink <iface> <status>"
metze

(This used to be ctdb commit 8d0c00b60db69bd10f12da4c676e1142dc37af7a)
2010-01-20 11:11:00 +01:00
Stefan Metzmacher
0ceef7036b tools/ctdb: add "ctdb ifaces"
metze

(This used to be ctdb commit 80053d09eed967fb76898f4a53437bed2b43a02f)
2010-01-20 11:11:00 +01:00
Stefan Metzmacher
a23c409e73 tools/ctdb: display INACTIVE status in "ctdb status" and "ctdb status -Y"
metze

(This used to be ctdb commit 18af37e99ef8ff5623161495be432abfe5e3407f)
2010-01-20 09:44:36 +01:00
Stefan Metzmacher
a03cf0040b ctdb: print out some hints how to debug a "ctdb catdb" failure
metze

(This used to be ctdb commit 504cf78d00d1120b556124340b9312f890b8b8b9)
2009-12-16 08:08:33 +01:00
Stefan Metzmacher
965c000c6e ctdb: add machinereadable output for "ctdb -Y getdbmap"
metze

(This used to be ctdb commit 45cfcd44093c7d2681e2ffd5cfb402823e8809f4)
2009-12-16 08:08:33 +01:00
Stefan Metzmacher
aa07a46bf5 ctdb: disallow "ctdb backupdb" on unhealthy databases
metze

(This used to be ctdb commit ecf799093c1989f5499c9d61ce8cc8a98d759160)
2009-12-16 08:08:33 +01:00
Stefan Metzmacher
c4bc231267 client: add "ctdb dumpdbbackup <filename>"
metze

(This used to be ctdb commit c63a0368d9d4b526ac1e49d891d3a1b7b8d20320)
2009-12-16 08:08:33 +01:00
Stefan Metzmacher
fb50e08942 tools/ctdb: let "ctdb restoredb" and "ctdb wipedb" mark the db as healthy on all
nodes

metze

(This used to be ctdb commit d1b10b0c0c323c39742a18e98a1dab7e82ddc7be)
2009-12-16 08:08:32 +01:00
Stefan Metzmacher
c56ce3d2f2 tools/ctdb: add "ctdb getdbstatus <dbname>"
metze

(This used to be ctdb commit 910c19f12448d293a755d1eb46d20f9591f8da7a)
2009-12-16 08:08:32 +01:00
Stefan Metzmacher
927dd3d9e5 tools/ctdb: display db health in "ctdb getdbmap"
metze

(This used to be ctdb commit c34535ff4dc6a44909283641596e0ed7c2316fbd)
2009-12-16 08:08:32 +01:00
Stefan Metzmacher
003985acfd ctdb: pass TDB_DISALLOW_NESTING to all tdb_open/tdb_wrap_open calls
metze

Signed-off-by: Stefan Metzmacher <metze@samba.org>

(This used to be ctdb commit 1635e931b909c66eb3b1f5357e3a549b1a0da70d)
2009-12-16 08:03:55 +01:00
Rusty Russell
cab8da8dc4 ctdb: don't print OUTPUT: for DISABLED scripts
In other news, did you know ctime() returns a \n-terminated string?

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 1b4e7bb548976b99f122142b040494b6f9911962)
2009-12-14 15:46:49 +11:00
Rusty Russell
a46c3b4f2a ctdb: scriptstatus can now query non-monitor events
We also no longer return an error before scripts have been run; a special
zero-length data means we have never run the scripts.

"ctdb scriptstatus all" returns all event script results.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 9b90d671581e390e2892d3a68f3ca98d58bef4df)
2009-12-08 01:50:55 +10:30
Rusty Russell
9e87377e7a ctdb: support --machinereadable (-Y) for scriptstatus
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 47ffe75848f216568ce3db0a60ca88cfe3d6903a)
2009-12-08 01:31:53 +10:30
Rusty Russell
9753b7e793 eventscript: rename ctdb_monitoring_wire to ctdb_scripts_wire
We're going to allow fetching status of all script runs, so this
name is no longer appropriate.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit f5cb41ecf3fa986b8af243e8546eb3b985cd902a)
2009-12-08 00:51:24 +10:30
Rusty Russell
c70afe0cd4 eventscript: handle and report generic stat/execution errors
Rather than ignoring deleted event scripts (or pretending that they were "OK"),
and discarding other stat errors, we save the errno and turn it into a negative
status.

This gives us a bit more information if we can't execute a script (eg.
too many symlinks or other weird errors).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 5d894e1ae5228df6bbe4fc305ccba19803fa3798)
2009-12-07 23:12:19 +10:30
Rusty Russell
b9b75bd065 eventscript: use -ENOEXEC for disabled status value
This unifies code paths and simplifies things: we just hand -ENOEXEC to
ctdb_ctrl_event_script_stop().

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit eadf5e44ef97d7703a7d3bce0e7ea0f21cb11f14)
2009-12-07 23:11:47 +10:30
Rusty Russell
066a791770 eventscript: use -ETIME for timeout status value
This starts the move toward more expressive encoding of return values:
positive values mean the script ran, negative means we had a problem with
the script (and the value is the errno).

This does timeout, but changes the ctdb tool to recognize it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 0eb1d0aa14e68b598d9e281c8a02b8f94a042fd9)
2009-12-07 23:09:42 +10:30
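
The status encoding introduced by the eventscript commits above (negative values carry an errno, with -ETIME for a timeout and -ENOEXEC for a disabled script) can be illustrated with a small decoder. This is a sketch in Python, not the ctdb tool's C code; the function name and the label strings are hypothetical, and treating 0 as success and a positive value as a failing exit status is an assumption based on "positive values mean the script ran".

```python
import errno
import os


def describe_script_status(status: int) -> str:
    """Turn an event script status value into a short label (labels are illustrative)."""
    if status == -errno.ETIME:
        return "TIMEDOUT"
    if status == -errno.ENOEXEC:
        return "DISABLED"
    if status < 0:
        # Any other negative value is a saved errno, e.g. ELOOP for too many symlinks.
        return "ERROR: " + os.strerror(-status)
    if status == 0:
        # Assumption: a zero status means the script ran and succeeded.
        return "OK"
    # Assumption: a positive value is the exit status of a script that ran.
    return f"FAILED (exit status {status})"
```

Keeping the errno in the sign bit range means a single integer distinguishes "the script ran and returned N" from "the script could not be run, and here is why".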
Michael Adam
92c5d9eefc ctdb: add command "ctdb wipedb" to wipe the contents of an attached tdb
Michael

(This used to be ctdb commit 5a7c1e7f15693522bbf1c39a53be2304ece9a134)
2009-12-04 11:30:20 +01:00
Ronnie Sahlberg
cc2d81a77c make the ringbuffer logging more efficient and marshall the data by writing to a tmpfile instead of continuously talloc resizing a blob
(This used to be ctdb commit 6427f0b68d60b556a023f64e15e156000ba6f943)
2009-11-18 19:10:50 +11:00
Ronnie Sahlberg
bc2675119d add an in memory ringbuffer where we store the last 500000 log entries regardless of log level.
add commands to extract this in memory buffer and to clear it

(This used to be ctdb commit 29d2ee8d9c6c6f36b2334480f646d6db209f370e)
2009-11-18 12:44:18 +11:00
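
As an illustration of the in-memory ringbuffer idea in the commit above, the following Python sketch keeps only the most recent entries, regardless of log level, and supports extracting and clearing them. It is not the ctdb implementation (which is C and uses talloc); the class and method names are hypothetical.

```python
from collections import deque


class LogRing:
    """Fixed-capacity log buffer: the oldest entries are dropped once it is full."""

    def __init__(self, capacity: int = 500000) -> None:
        # deque with maxlen discards from the opposite end on append when full.
        self._entries = deque(maxlen=capacity)

    def add(self, level: int, message: str) -> None:
        # Every entry is stored, whatever its log level.
        self._entries.append((level, message))

    def extract(self) -> list:
        # Return a snapshot, oldest entry first.
        return list(self._entries)

    def clear(self) -> None:
        self._entries.clear()
```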
Ronnie Sahlberg
f88fbb5f1e suggestion from Christian,
don't allow UNHEALTHY nodes to become natgw master, unless all nodes
are unhealthy

(This used to be ctdb commit e8e7129ff1371065fbd75e1aea844d6d04a96fa9)
2009-11-06 08:19:32 +11:00
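
The selection rule in the commit above (prefer a healthy node, and accept an unhealthy one only when every node is unhealthy) might look like the following. This is a Python sketch under stated assumptions: the function name is hypothetical, nodes are represented as (node number, healthy) pairs, and picking the first candidate in list order is an assumption that the commit message does not confirm.

```python
from typing import List, Optional, Tuple


def pick_natgw_master(nodes: List[Tuple[int, bool]]) -> Optional[int]:
    """Pick a natgw master from (node_number, healthy) pairs in candidate order."""
    for node_number, healthy in nodes:
        if healthy:
            # A healthy node always wins over an unhealthy one.
            return node_number
    # Only reached when no node is healthy: fall back to the first candidate, if any.
    return nodes[0][0] if nodes else None
```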
Ronnie Sahlberg
fcd2ebc32b update the uptime command to indicate that time since last is either from last recovery or from last failover
(This used to be ctdb commit 467da12a785ba3367ed9cbdf79440394e9703289)
2009-10-29 10:58:14 +11:00
Ronnie Sahlberg
023d09cd38 Revert "update the "uptime" command to indicate the "time since last" is the time since the last recovery OR failover."
This reverts commit 3b0d44497800a16400d05a30bdaf6e6c285d4b36.

(This used to be ctdb commit cb36bbb5418290e8e5b770d2d836285b15da2a6f)
2009-10-29 10:49:00 +11:00
Ronnie Sahlberg
279b7ca564 update the "uptime" command to indicate the "time since last" is the time since the last recovery OR failover.
(This used to be ctdb commit 3b0d44497800a16400d05a30bdaf6e6c285d4b36)
2009-10-29 10:37:10 +11:00
Ronnie Sahlberg
4d40b86805 for debugging
add a global variable holding the pid of the main daemon.
change the tracking of time() in the event loop to only check/warn when called from the main daemon

(This used to be ctdb commit a10fc51f4c30e85ada6d4b7347b0f9a8ebc76637)
2009-10-27 13:18:52 +11:00
Stefan Metzmacher
3d713d9e53 ctdb_diagnostics: don't use hardcoded path to iptables
All event scripts use only the relative path, so we should
here.

Also PATH includes /sbin and /usr/sbin...

metze

(This used to be ctdb commit 20678e1506db1f96b58c326ee91339e797c07c22)
2009-10-26 14:23:09 +11:00
Ronnie Sahlberg
d08e3c628d Merge commit 'martins/onnode_options'
(This used to be ctdb commit 82fad66123c1b8c5d4ed3b19c39acf6f367b3f37)
2009-10-14 15:51:57 +11:00
Martin Schwenke
f0dd32e412 Merge commit 'origin/master' into onnode_options
(This used to be ctdb commit e62928f56ce8927b1d8686db2c31538c86462d1a)
2009-10-14 13:49:30 +11:00
Martin Schwenke
787a6e44c6 New onnode options: -f to specify nodes file, -n to allow use of hostnames.
The -f option allows an alternate nodes file to be specified,
overriding the CTDB_NODES_FILE environment variable.

The -n option allows hostnames to be used instead of node numbers.
Using a range of hostnames is invalid, so hostnames can't contain
hyphens ('-') - sorry!  You can use this option without a nodes file
by specifying "-f /dev/null".

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 46474e5f21fd97dd765c616647ff46055a9970e7)
2009-10-14 13:44:57 +11:00
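
The reason hostnames cannot contain hyphens when -n is used can be shown with a small node-specification parser. This is a Python sketch, not the onnode shell code; the function name is hypothetical, and the comma-separated list syntax is an assumption (the commit message itself only mentions ranges).

```python
from typing import List


def expand_nodespec(spec: str) -> List[str]:
    """Expand a node specification such as "0,2-4" into individual node names."""
    nodes = []
    for token in spec.split(","):
        if "-" in token:
            # A hyphen always means a numeric range, so "web-1" cannot be a hostname:
            # int() raises ValueError for a non-numeric bound.
            first, last = token.split("-", 1)
            nodes.extend(str(n) for n in range(int(first), int(last) + 1))
        else:
            # Either a node number or, with -n, a hostname.
            nodes.append(token)
    return nodes
```

With this rule a hostname such as `alpha` passes through unchanged, while `web-1` is rejected because it is read as a range.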
Ronnie Sahlberg
80be59d35e when we change state between healthy/unhealthy, make sure we ask the recovery
master to perform an explicit ip reallocation.

This is more reliable and faster than having the recovery daemon track these
changes, and since we now have an explicit method to ask the recovery daemon
to perform an explicit ip reallocation, we should use this.

(This used to be ctdb commit 3807681e74f4bfe92befdae6ed616ff5f1a99880)
2009-10-14 11:59:16 +11:00
Ronnie Sahlberg
98b5caf003 we must break the loop as soon as we find that a suitable recmaster exists
otherwise "ctdb ipreallocate" will silently fail to update the addresses.

(This used to be ctdb commit 346fa055f4106497b87df97da5ebd6e51fa1ef8c)
2009-10-13 09:49:05 +11:00
Ronnie Sahlberg
771802b212 allow setting the recmode even when not completely frozen.
we sometimes have to do this when we want to trigger a recovery

(This used to be ctdb commit 46194e87e189521375b39b4ef33da2b493429fd8)
2009-10-12 13:06:16 +11:00
Ronnie Sahlberg
d4c98516a2 update the freeze/thaw commands to be able to send the requested database priority to freeze/thaw to the daemon.
this is encoded in the srvid field of the request header

(This used to be ctdb commit 0cb3d33caa42ed783e03bc825b181dde4cf63616)
2009-10-12 09:22:17 +11:00
Ronnie Sahlberg
3219f81710 add a control to read the db priority from a database
(This used to be ctdb commit ca6d045e419f308f57e74d4c978907afb05ddb85)
2009-10-10 15:04:18 +11:00
Ronnie Sahlberg
6cf7d8e131 add a control to set a database priority. Let newly created databases default to priority 1.
database priorities will be used to control the order in which databases are locked during recovery.

(This used to be ctdb commit 67741c0ee01916d94cace8e9462ef02507e06078)
2009-10-10 14:26:09 +11:00
Ronnie Sahlberg
134ed842fa always send the release/take ip controls to make sure all nodes are updated
(This used to be ctdb commit 789703ea684717781c176fd3a2a24d96abde220b)
2009-10-06 12:25:44 +11:00
Ronnie Sahlberg
166b1c97b4 add a new message to ask the recovery daemon to temporarily disable checking ip address consistency.
This is useful when we are moving addresses using moveip in the cluster since otherwise if we collide with the recovery daemons own check we could cause a recovery

(This used to be ctdb commit 9c63858c0b22c81eaccb9865a414af0bbb2833d4)
2009-10-06 12:11:32 +11:00