mirror of https://github.com/samba-team/samba.git synced 2025-01-12 09:18:10 +03:00
Commit Graph

3204 Commits

Author SHA1 Message Date
Rusty Russell
ac90f15424 tdb: fix non-WAF build, commit 1.2.6 ABI file.
Sorry Jeremy.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 7db9838cb5af0d334efbbcb96bfa51d19b35941a)
2010-10-07 15:17:43 +10:30
Rusty Russell
74b2eacede tdb: TDB_INCOMPATIBLE_HASH, to allow safe changing of default hash.
This flag to tdb_open/tdb_open_ex affects creation of a new database:
1) Uses the Jenkins lookup3 hash instead of the old gdbm hash if none is
   specified,
2) Places a non-zero field in header->rwlocks, so older versions of TDB will
   refuse to open it.

This means that the caller (ie Samba) can set this flag to safely
change the hash function.  Versions of TDB from this one on will either
use the correct hash or refuse to open (if a different hash is specified).
Older TDB versions will see the nonzero rwlocks field and refuse to open
it under any conditions.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit dd86b24ae5307fe09d4ae22b7070d747013a2b07)
2010-10-07 15:17:38 +10:30
Rusty Russell
6a4c5c8e71 tdb: automatically identify Jenkins hash tdbs
If the caller to tdb_open_ex() doesn't specify a hash, and tdb_old_hash
doesn't match, try tdb_jenkins_hash.

This was Metze's idea: it makes life simpler, especially with the upcoming
TDB_INCOMPATIBLE_HASH flag.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 3f7ed2b46cb304d553d3f7bd34554d695b8ccc52)
2010-10-07 15:17:18 +10:30
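A minimal sketch of the fallback described in 6a4c5c8e71, assuming the header carries an example hash as introduced by "put example hashes into header". The two hash functions are stand-ins built from zlib (not the real tdb_old_hash or tdb_jenkins_hash), and EXAMPLE_KEY, make_header and pick_hash are hypothetical names used only for illustration.

```python
import zlib

# Stand-ins only: the real tdb uses its old gdbm-style hash and Jenkins lookup3.
def old_like(key: bytes) -> int:
    return zlib.adler32(key)

def jenkins_like(key: bytes) -> int:
    return zlib.crc32(key)

EXAMPLE_KEY = b"tdb example key"  # hypothetical sample hashed into the header

def make_header(hash_fn):
    """Record what the creating hash function yields for the example key."""
    return {"example_hash": hash_fn(EXAMPLE_KEY)}

def pick_hash(header, caller_hash=None):
    """Choose the hash for an existing database, or refuse to open it."""
    if caller_hash is not None:
        # An explicitly specified hash must match the one used at creation.
        if caller_hash(EXAMPLE_KEY) != header["example_hash"]:
            raise ValueError("specified hash does not match database")
        return caller_hash
    # No hash specified: try the old default first, then the Jenkins-style one.
    for candidate in (old_like, jenkins_like):
        if candidate(EXAMPLE_KEY) == header["example_hash"]:
            return candidate
    raise ValueError("database uses an unknown hash")
```

The point of the design is that a caller who passes no hash can open databases created with either function, while a caller who passes the wrong one is refused rather than silently misplacing records.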
Rusty Russell
9faf6888d4 tdb: add Bob Jenkins lookup3 hash as helper hash.
This is a better hash than the default: shipping it with tdb makes it easy
for callers to use it as the hash by passing it to tdb_open_ex().

This version taken from CCAN and modified, which took it from
http://www.burtleburtle.net/bob/c/lookup3.c.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 58c9d90c758aa7c062d84ab97f62947190526356)
2010-10-07 15:17:05 +10:30
Volker Lendecke
140592fe13 tdb: add restore
Based on an idea by Simon McVittie, largely rewritten

(This used to be ctdb commit 7cda5507f90d7598d745a1acfc66c2afa73cd4b5)
2010-10-07 15:15:45 +10:30
Günther Deschner
1cd12ba7f0 lib/tdb: fix c++ build warning in tdb_header_hash().
Guenther

(This used to be ctdb commit e34e639c214b010ff18140b769a8c9245c92006f)
2010-10-07 15:11:53 +10:30
Jelmer Vernooij
4892ad57de pytdb: Make filename argument optional.
(This used to be ctdb commit 3cc73c51caff51e0cba688aefd6f37e632c0e8d4)
2010-10-07 15:11:48 +10:30
Kirill Smelkov
665b3fbaa7 pytdb: Add support for tdb_freelist_size()
Cc: 597386@bugs.debian.org
Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
Signed-off-by: Jelmer Vernooij <jelmer@samba.org>

(This used to be ctdb commit dcdd83e6d6786f0857acdf9aa04bca74a7ccf14d)
2010-10-07 15:11:26 +10:30
Kirill Smelkov
f1a720f08f pytdb: Add support for tdb_transaction_prepare_commit()
Cc: 597386@bugs.debian.org
Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
Signed-off-by: Jelmer Vernooij <jelmer@samba.org>

(This used to be ctdb commit fd16bcc1434841d84fdf78f80163c97c0b52b3fe)
2010-10-07 15:11:24 +10:30
Kirill Smelkov
7b88df14e0 pytdb: Add support for tdb_enable_seqnum, tdb_get_seqnum and tdb_increment_seqnum_nonblock
Cc: 597386@bugs.debian.org
Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
Signed-off-by: Jelmer Vernooij <jelmer@samba.org>

(This used to be ctdb commit 1778fd02eec6e64737167c46173c0c76c85cc4d9)
2010-10-07 15:11:23 +10:30
Kirill Smelkov
2473345d46 pytdb: Update open flags to match those for tdb_open() in tdb.h
Namely TDB_NOSYNC, TDB_SEQNUM, TDB_VOLATILE, TDB_ALLOW_NESTING and
TDB_DISALLOW_NESTING were missing.

Cc: 597386@bugs.debian.org
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Jelmer Vernooij <jelmer@samba.org>

(This used to be ctdb commit d0c28ff1fedd27a99a7550fcc74e18cb1f536986)
2010-10-07 15:11:21 +10:30
Kirill Smelkov
95386b0283 pytdb: Fix repr segfault for internal db
The problem was tdb->name is NULL for TDB_INTERNAL databases, and
so it was crashing ...

    #0  0xb76944f3 in strlen () from /lib/i686/cmov/libc.so.6
    #1  0x0809862b in PyString_FromFormatV (format=0xb72b6a26 "Tdb('%s')", vargs=0xbfc26a94 "")
        at ../Objects/stringobject.c:211
    #2  0x08098888 in PyString_FromFormat (format=0xb72b6a26 "Tdb('%s')") at ../Objects/stringobject.c:358
    #3  0xb72b65f2 in tdb_object_repr (self=0xb759e060) at ./pytdb.c:439

Cc: 597089@bugs.debian.org
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Jelmer Vernooij <jelmer@samba.org>

(This used to be ctdb commit 3ff413baf04ce28eb54a80141250ae1284b2a521)
2010-10-07 15:11:14 +10:30
Kirill Smelkov
7c72d220b5 pytdb: Add support for tdb_add_flags() & tdb_remove_flags()
Note, unlike tdb_open where flags is `int', tdb_{add,remove}_flags want
flags as `unsigned', so instead of "i" I used "I" in PyArg_ParseTuple.

Cc: 597386@bugs.debian.org
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Jelmer Vernooij <jelmer@samba.org>

(This used to be ctdb commit 7389f8a8a634c2fe0f068831326d92e6bfa0d046)
2010-10-07 15:08:13 +10:30
Andrew Tridgell
bf8dfcfe7d tdb: added TDB_NO_FSYNC env variable
this might help reduce test times and load on test machines

(This used to be ctdb commit 5c4240c364c52073ca64fddf2aa2c1593db0093b)
2010-10-07 15:08:04 +10:30
Rusty Russell
bc2094c9ca tdb: increment version to 1.2.4
(This used to be ctdb commit f1c06608245ec34493c330d891e04c250ad64b20)
2010-10-07 15:07:22 +10:30
Rusty Russell
fc47015894 tdb: put example hashes into header, so we notice incorrect hash_fn.
This is Stefan Metzmacher <metze@samba.org>'s patch with minor changes:
1) Use the TDB_MAGIC constant so both hashes aren't of strings.
2) Check the hash in tdb_check (paranoia, really).
3) Additional check in the (unlikely!) case where both examples hash to 0.
4) Cosmetic changes to var names and complaint message.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 63c582c99128c3623e270e8425966cab7744fb2f)
2010-10-07 15:05:59 +10:30
Rusty Russell
ef329186d6 tdb: fix tdb_check() on other-endian tdbs.
We must not endian-convert the magic string, just the rest.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 525390863ad39acea08ceb88531dc59d118fcad4)
2010-10-07 15:05:58 +10:30
Rusty Russell
05da60f770 tdb: fix tdb_check() on read-only TDBs to actually work.
Commit bc1c82ea13 "Fix tdb_check() to work with read-only tdb databases."
claimed to do this, but tdb_lockall_read() fails on read-only databases.

Also make sure we can still do tdb_check() inside a transaction (weird,
but we previously allowed it so don't break the API).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 2558eb250011893d09dbeaedaffeefa0e397142f)
2010-10-07 15:05:56 +10:30
Rusty Russell
3bd7dd8bd8 tdb: make check more robust against recovery failures.
We can end up with dead areas when we die during transaction commit;
tdb_check() fails on such a (valid) database.

This is particularly noticeable now we no longer truncate on recovery;
if the recovery area was at the end of the file we used to remove it
that way.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit b4162a95ff9ae28cda8d9c76c51c9480104517a7)
2010-10-07 15:05:55 +10:30
Ronnie Sahlberg
f98ffde65a Don't log a normal vacuuming message about a missing record and using default vacuuming intervals as an error.
This is normal for a new system until the vacuuming has been initialized.

(This used to be ctdb commit ffd5fdd23b1cb07078759a78cd1d884f92aa4851)
2010-10-07 14:40:24 +11:00
Ronnie Sahlberg
b67754fa4d when printing machine-readable statistics only print the header with the fieldnames once
(This used to be ctdb commit 70c8d429d7c13cbbd08184ff8f0aa506de5adccc)
2010-09-30 15:08:12 +10:00
Ronnie Sahlberg
1a716ec300 add a machine-readable version of ctdb stats/statistics
(This used to be ctdb commit 3a033156c48d821d48fd18f12c3b0ac14bbddc93)
2010-09-30 15:01:08 +10:00
Ronnie Sahlberg
3ba7ac13eb Create a tunable for how often to collect rolling statistics and initialize it to 1 second
(This used to be ctdb commit cb8c779bb5d9862abbe08919aa181a1a1b2bef18)
2010-09-30 15:00:57 +10:00
Ronnie Sahlberg
9f66a93f12 Add rolling statistics that are collected across 10 second intervals.
Add a new command "ctdb stats [num]" that prints the [num] most recent statistics intervals collected.

(This used to be ctdb commit e6e16fcd5a45ebd3739a8160c8fb5f44494edb9e)
2010-09-29 12:14:45 +10:00
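The rolling statistics in 9f66a93f12 can be pictured as a bounded history of closed intervals plus one running set of counters. This is a sketch under assumptions, not ctdb's implementation: the class RollingStats, its method names and the default history length are hypothetical, and the timer that would call roll() at each interval is left out.

```python
from collections import deque

class RollingStats:
    """Counters for the current interval plus a bounded history of past ones."""

    def __init__(self, max_intervals=10):
        self.current = {}
        # deque(maxlen=...) silently drops the oldest interval when full.
        self.history = deque(maxlen=max_intervals)

    def count(self, name, n=1):
        """Single update point, in the spirit of the counter macros."""
        self.current[name] = self.current.get(name, 0) + n

    def roll(self):
        """Close the current interval and start a fresh one."""
        self.history.append(self.current)
        self.current = {}

    def recent(self, num):
        """Return the num most recent closed intervals, newest first."""
        return list(self.history)[::-1][:num]
```

Here recent(num) stands in for what "ctdb stats [num]" prints; routing every update through count() mirrors the later commit that replaces direct counter manipulation with macros.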
Ronnie Sahlberg
41b6e09fb1 Add a new statistics structure to keep the current running statistics
(This used to be ctdb commit 09e5a2fb47c312f71f455cdbf8d9cabcca1041a4)
2010-09-29 12:14:35 +10:00
Ronnie Sahlberg
39c367a68f Create macros to update the statistics counters and use these macros
everywhere instead of manipulating the counters directly.

(This used to be ctdb commit 2e648df890e5713bc575965d87937827b068d0d7)
2010-09-29 12:14:24 +10:00
Ronnie Sahlberg
869242a7cd Add back monitoring for time skips, forward as well as backward.
This serviceability tool was lost during the migration from the old eventsystem to the tevent system.

(This used to be ctdb commit b4c00b4ac30ec215629f44f802ce9660abcd7a48)
2010-09-28 08:59:35 +10:00
Ronnie Sahlberg
107d020cfa update/improve the log message related to rerecovery timeouts
(This used to be ctdb commit 8b4d1df3abcae03cf7a339d8390c816682a43019)
2010-09-28 08:47:12 +10:00
Ronnie Sahlberg
c6e20a06c7 set up a handler to catch and log debug messages from the tevent layer
(This used to be ctdb commit fdb4c02f595fa207310a9a48da3fefd653fa9e4b)
2010-09-28 08:30:26 +10:00
Ronnie Sahlberg
22ea35f17d add a GETPUBLICIPS control to libctdb and use this in the test example
enhance the test example to show the new releaseip/takeip messages

(This used to be ctdb commit 21cc57883e6c02b0e037211b26d1d866d5d7f03d)
2010-09-15 14:58:11 +10:00
Stefan Metzmacher
0b5bd411ca server/banning: also release all ips if we're banning ourself
metze

(This used to be ctdb commit c386f2c62f06f1c60047b7d4b1ec7a9eec11873c)
2010-09-14 15:50:31 +10:00
Stefan Metzmacher
5e46150490 server/recoverd: if we can't get the recovery lock, ban ourself
metze

(This used to be ctdb commit 80b8889267339b870868841ff077e850bc5b52e2)
2010-09-14 15:49:01 +10:00
Stefan Metzmacher
ff77985f38 server/recoverd: do takeover_run after verifying the reclock file
metze

(This used to be ctdb commit 93df096773c89f21f77b3bcf9aa90bf28881b852)
2010-09-14 15:48:37 +10:00
Stefan Metzmacher
96ddf2f607 server/monitor: ask for a takeoverrun after propagating our new flags
metze

(This used to be ctdb commit 942f44123350d4d0c4ad7f3fcd5ff2d0d175739b)
2010-09-14 15:48:10 +10:00
Ronnie Sahlberg
d8d8b9e1d7 add a new serverid to send a message every time an ip address is taken on the local node
(This used to be ctdb commit 1261f3d9702800a4e59550c881350daf479f00ef)
2010-09-13 15:43:19 +10:00
Ronnie Sahlberg
991a6ae2a0 Update the comment for the range reserved for SAMBA and
define a new symbol to represent this range similarly to NFSD and ISCSID

Keep the old symbol name to be backward compatible with software using
these headers.

(This used to be ctdb commit 2ce34e50d057ba95249117a581658a5ad7e8eb60)
2010-09-13 15:10:36 +10:00
Ronnie Sahlberg
09a08b0da3 define and reserve a range of ctdb message ports for use by nfs and iscsi servers
(This used to be ctdb commit 84a44ac8ee74dd7af15e378c6cafbedb95feec60)
2010-09-13 15:10:24 +10:00
Ronnie Sahlberg
65382a59d1 Add two new server types to the server_id structure.
NFSD and ISCSID for now.

(This used to be ctdb commit 4cd4bab68f0ba0305a585a2aabcb6871cdb11d96)
2010-09-13 15:10:12 +10:00
Ronnie Sahlberg
a2c874bd61 Implement a new function GETNODEMAP in libctdb.
This function returns a pointer to a nodemap structure.

The returned structure must later be freed by calling ctdb_free_nodemap().

Move the definition of ctdb_sock_addr from ctdb_client.h to ctdb_protocol.h

Move the definition of the node flags, ctdb_node_and_flags and ctdb_node_map from ctdb_private.h to ctdb_protocol.h

Add both sync and async example for ctdb_getnodemap to the test application libctdb/tst.c

(This used to be ctdb commit 31c10eb2b337fd7d8a97a1f9e69b0e7570fec71d)
2010-09-13 14:32:11 +10:00
Ronnie Sahlberg
19211f99c8 remove an unused variable
(This used to be ctdb commit e07fdbaf12bbe84370bc47a1979fe198a06a6cc8)
2010-09-13 13:13:12 +10:00
Ronnie Sahlberg
bb22ff0f50 Don't try to read the nodemap from the daemon for "ctdb listnodes"
Always read it from the /etc/ctdb/nodes file

(This used to be ctdb commit a0fdb25bb2cac177cdc32b938fa08fd665aa873e)
2010-09-09 07:38:28 +10:00
Ronnie Sahlberg
f5c0539dc6 Change how NATGW is configured to allow special nodes that do not have
network connectivity outside of the cluster to still be able to
participate in a natgw group.
These nodes can not become natgw master since they lack external network
connectivity.

These nodes are configured just the same way as for any other node with
NATGW, with the following two exceptions :
* we do NOT set CTDB_NATGW_PUBLIC_IFACE at all on these nodes.
  since these nodes lack external network we should not check the interface
  for link.
* we must set CTDB_NATGW_SLAVE_ONLY=yes to flag that this is a node that
  can not become natgw master.

(This used to be ctdb commit ab7b00a37e55beffc074be95b55d8a5c7cb9eef2)
2010-09-08 09:20:16 +10:00
Ronnie Sahlberg
dc2f87737d Don't store temporary runtime data in $CTDB_BASE/state
since that will usually be /etc/ctdb/state and storing this under /etc is just
wrong.

Add a new variable CTDB_VARDIR that defaults to /var/ctdb and store the data there instead.

(This used to be ctdb commit 516423c25afa9861d9988096efa8a4a2b12b31b1)
2010-09-03 12:43:28 +10:00
Ronnie Sahlberg
7c682dda59 When a memory allocation for recovery fails,
don't dereference a null pointer while trying to print the log message for the failure.

Also shut down ctdb with ctdb_fatal()

(This used to be ctdb commit f8642d0438c6bbb34a72c25d6a904b626e247410)
2010-09-03 12:00:48 +10:00
Harald Klatte
f3078b1c7f AIX bind wants the correct addrsize
(This used to be ctdb commit b5169e037fe113a5b62f510646b8fefc055c053b)
2010-09-03 11:49:19 +10:00
Ronnie Sahlberg
c7df27e32d make sure all statd state directories exist before we try to reference them
or else tar and friends will throw an error in the log

(This used to be ctdb commit 96cbd2c0aa9a4641a42b3c33374675fa732ed1e5)
2010-09-01 15:49:57 +10:00
Ronnie Sahlberg
8be5bf1567 don't print a lot of log information about shutting down vsftpd
(This used to be ctdb commit 1a41cd7332703629001201eea8ae9b94f1341c9d)
2010-09-01 13:29:38 +10:00
Ronnie Sahlberg
9ef21f1c07 ouch, remove a dummy debug printout that snuck in there somehow
(This used to be ctdb commit 14c4d99513b4bdb94f60c3e9c4823e04b0833e60)
2010-08-30 19:48:41 +10:00
Ronnie Sahlberg
8d12313d6b ouch, the ordering of the constants and the strings must be kept in sync
manually, and there is no check for errors. Should fix this later.

(This used to be ctdb commit e824af1a41f8ceec1edf6b3d1d6e1758fa00deb2)
2010-08-30 19:43:35 +10:00
Ronnie Sahlberg
0757edfd83 remove 61.nfstickles from the makefile
(This used to be ctdb commit 893465ddde0b730aa142f165cfdc4a57fc5517bf)
2010-08-30 18:29:56 +10:00
Ronnie Sahlberg
3376d9e72a we no longer have a 61.nfstickle script
(This used to be ctdb commit 8909d3a10362a8e58ffd71bc4cd035c12c584157)
2010-08-30 18:22:28 +10:00
Ronnie Sahlberg
2b4d9170c2 Merge commit 'martins/master'
(This used to be ctdb commit cc8c851e2e0b46f00b18a6dc61fd2774e97850dd)
2010-08-30 18:22:05 +10:00
Ronnie Sahlberg
92455c3dff remove the mention of a tickle and statd directory in shared storage now that we are removing these and migrating to store the data inside ctdbd or persistent databases
(This used to be ctdb commit 230bec8d375b778b20ff3cb7f9864c26323997f3)
2010-08-30 18:16:41 +10:00
Ronnie Sahlberg
12cc826231 Remove the dependency on the underlying cluster filesystem for handling
the clusterwide persistent data associated with the lock manager and
statd notifications.

Use persistent databases to store this data instead of a shared directory.

(This used to be ctdb commit fc0678d351187cfa4c71123f97c0f493aacd5d16)
2010-08-30 18:14:41 +10:00
Ronnie Sahlberg
c95f4258d8 Add a new event "ipreallocated"
This is called every time a reallocation is performed.

    While STARTRECOVERY/RECOVERED events are only called when
    we do ipreallocation as part of a full database/cluster recovery,
    this new event can be used to trigger on when we just do a light
    failover due to a node becoming unhealthy.

    I.e. situations where we do a failover but we do not perform a full
    cluster recovery.

    Use this to trigger for natgw so we select a new natgw master node
    when failover happens and not just when cluster rebuilds happen.

(This used to be ctdb commit 7f4c591388adae20e98984001385cba26598ec67)
2010-08-30 18:09:30 +10:00
Martin Schwenke
46b9110f88 Test suite: Make NFS tickle test more flexible.
Use onnode any where possible rather than a fixed node.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 51561720d2b4db5b307da3d410661075e2a6c3ca)
2010-08-27 11:43:50 +10:00
Martin Schwenke
9878f8cbc2 Test suite: Fix NFS tickle test.
We now kill ctdbd on the test node instead of disabling it.  This
ensures that the only tickles we see will come from the takeover node.

We also sleep for TickleUpdateInterval before asking ctdb
about the tickles.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 48cd8325c070f6942aa13a25269021e4c8ed188f)
2010-08-27 11:40:44 +10:00
Martin Schwenke
68717f689d Test suite: Tweak NFS tickle test.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c32ffd203e42a39010ce2d6e98253e8e48de515a)
2010-08-26 17:56:50 +10:00
Martin Schwenke
d7b169be9a Test suite: Fix typos in NFS tickle test.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c35d3e6341bc4e288393efa429b68bf6568b9b11)
2010-08-26 15:50:35 +10:00
Martin Schwenke
9235dc727a Test suite: NFS tickle test uses gettickles if events.d/61.nfstickle missing.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 4763ccbfeaedd0fd953dbeda17ef9af41386688b)
2010-08-26 15:28:19 +10:00
Martin Schwenke
a104d1d823 NFS tickles: use addtickle/deltickle instead of shared tickle directory.
This adds a new function update_tickles() that tracks tickles for a
given port using the new ctdb addtickle/deltickle commands.  This
function is used in events.d/60.nfs to handle NFS tickles.

events.d/61.nfstickle is removed.  The
/proc/sys/net/ipv4/tcp_tw_recycle setup is also moved to
events.d/60.nfs.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit dca4c4ebf3c35f8db3ae208efb7a83abbf726ed6)
2010-08-26 14:59:59 +10:00
Martin Schwenke
0d2c554d5f Test suite: in the test eventscript, run "ctdb" not "$CTDB".
It is too hard to do anything else...

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 08b636b500855e38e708e6963d8e63ded97c25ec)
2010-08-26 14:04:03 +10:00
Martin Schwenke
6d15082045 Merge branch 'master' of git://git.samba.org/sahlberg/ctdb
(This used to be ctdb commit 090d9c8443cfa13d45f8c5d2845aea5aa9f7251d)
2010-08-26 11:06:57 +10:00
Ronnie Sahlberg
3edec07807 Add a configuration database, implemented as a persistent database.
This database can be used, as an option, to store
the public address assignment instead of editing the /etc/ctdb/public-addresses file manually.

This configuration is stored in one record per key, with a key-name of
public-addresses:node#<pnn>
where <pnn> is the node number.

The content of this record is the same syntax as the /etc/ctdb/public-addresses file.

When ctdbd starts, if this key exists and contains data, it is extracted from the database and compared with the normal file /etc/ctdb/public-addresses.

If the content differs, the config database "wins" and is used to overwrite/update the /etc/ctdb/public-addresses file, after which ctdbd is restarted.

The main benefit with this option is that it can be used to update the public address configuration for nodes that are offline/unreachable by updating their configuration in the persistent database.
Once the offline node is available again, it will resync its databases with the rest of the cluster, find out that the config has changed, apply the changes and restart ctdbd automatically.

The command to store the public address configuration for a node into the persistent database is :

ctdb pstore config.tdb public-addresses:node#<pnn> <filename>

where <pnn> is the node# we wish to update the config for, and <filename> is a file containing the new content for that node's public address configuration.

(This used to be ctdb commit 292d7435a360efd7f15a7a99f658a605e07c0a81)
2010-08-25 11:49:56 +10:00
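The "config database wins" rule in 3edec07807 could be sketched as below. This is an illustration under assumptions: the persistent database is modelled as a plain dict of bytes, config_key and apply_config are hypothetical names, and the actual restart of ctdbd is left to the caller.

```python
def config_key(pnn):
    """Key name used for a node's record, as given in the commit message."""
    return f"public-addresses:node#{pnn}"

def apply_config(db, pnn, path):
    """Overwrite the file at path when the stored record differs from it.

    Returns True when the file was rewritten, meaning ctdbd should restart.
    """
    record = db.get(config_key(pnn))
    if not record:
        # Key missing or empty: keep using the file as it is.
        return False
    try:
        with open(path, "rb") as f:
            current = f.read()
    except FileNotFoundError:
        current = None
    if current == record:
        return False
    # Contents differ, so the database copy replaces the file.
    with open(path, "wb") as f:
        f.write(record)
    return True
```

Because the record uses the same syntax as the file, the comparison is a plain byte comparison, and a node that was offline only needs to resync the database to pick up the change.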
Ronnie Sahlberg
55c619f072 the tfetch command can be used without the daemon running, so flag it as such.
fix a couple of incorrect settings for "auto-all" for a few of the commands as well.

(This used to be ctdb commit 9999771105d7105efaa232fe2842e21e66f78706)
2010-08-25 11:11:12 +10:00
Ronnie Sahlberg
018063b8eb add a new command "ctdb tfetch" that can read a record straight out of the
tdb file.

the command automatically strips the initial ctdb header off the record so it can only be used on ctdb managed tdb files, not on normal tdb files.

(This used to be ctdb commit c3a816e5174abefb5155f65d8faad7b1e831e481)
2010-08-25 10:56:02 +10:00
Ronnie Sahlberg
f75b984b71 When "ctdb pfetch" creates a new file, make sure we set some initial sane mode bits
(This used to be ctdb commit 87160c91bfd87e8b9c510dacbf00e5aa481d2305)
2010-08-25 10:35:12 +10:00
Ronnie Sahlberg
ac335e3e5d run the "init" event before we freeze the databases
so that we can read from databases during this event

(This used to be ctdb commit 6c93bf5a1219617bfb39b093aee3200c74c2c61a)
2010-08-25 08:35:24 +10:00
Ronnie Sahlberg
4c5a4015f3 change "ctdb pfetch" to take an optional third argument
as a file to store the record in.

(This used to be ctdb commit 6d7e62f5401f0647a519fe0b74ec628418e33231)
2010-08-25 08:07:47 +10:00
Ronnie Sahlberg
a8db1adcd6 add a command to write a record to a persistent database
"ctdb pstore <db> <key> <file containing possibly binary data>"

(This used to be ctdb commit 14184ab7c80a3ef16c54b4ab168fd635b7add445)
2010-08-24 14:00:18 +10:00
Ronnie Sahlberg
4da818504a get rid of two compiler warnings
(This used to be ctdb commit 0865f0e6ef671396aa862f6a79a48a4891d72122)
2010-08-24 14:00:10 +10:00
Ronnie Sahlberg
401732a56b Add a command "ctdb pfetch <db> <record>" to read a record from
a persistent database.

(This used to be ctdb commit 3bef831b96ce8b40457ed4de527f0d62fa6a5b00)
2010-08-24 14:00:02 +10:00
Martin Schwenke
f2e2abbaad Merge branch 'master' of git://git.samba.org/sahlberg/ctdb
(This used to be ctdb commit 718ddc2264c28185fcddbc9cb0c7137d198a43a7)
2010-08-24 11:53:29 +10:00
Ronnie Sahlberg
ccdb91a169 move the directives to build the devel file to the end of the specfile
so that the dependencies are right
or else the dependencies all end up in the devel package and not the main
ctdb package

(This used to be ctdb commit 6e4347eb8e62c28987820f6e58626271c900b011)
2010-08-23 16:00:19 +10:00
Ronnie Sahlberg
e040a966af Don't set next_interval to 0.
This can cause ctdbd to spin at 100% in the eventsystem,
creating a timed event that will immediately trigger again
and again.

On uniprocessors this causes the eventscript we are actually waiting for to
basically become cpu starved and never complete.

(This used to be ctdb commit 92c8408fba957a8ded13f7e285da290502735234)
2010-08-20 15:00:45 +10:00
Ronnie Sahlberg
1ef66379d7 ctdb ip is very busy.
revert the default case back to only showing the ip and node
and only display the extra info if -v verbose output is requested

(This used to be ctdb commit 6488651aa7e105c57324f4a300760a010d098fbb)
2010-08-20 11:38:34 +10:00
Ronnie Sahlberg
08a5b0c7c5 add a new commandline flag -v to enable verbose output
(This used to be ctdb commit 96dd9f40f9464c3d9de98f1323568724a1e31dc9)
2010-08-20 11:28:24 +10:00
Ronnie Sahlberg
388d18cc93 make it possible to "ctdb gettickle" to only list tickles for a certain
port.

Default is to continue to show all tickles, but if a second argument
is given, only tickles for that port will be shown.

(This used to be ctdb commit 5b985eb2cbbb92bf6ccfcacd633d793bcd4e3ec1)
2010-08-20 11:25:12 +10:00
Ronnie Sahlberg
7229922d97 Don't use the deprecated talloc_append_string()
Use talloc_strdup_append() instead

(This used to be ctdb commit e41581347af5ef26d429d38ed48fa46244f0dbfc)
2010-08-20 11:03:17 +10:00
Ronnie Sahlberg
32a2297b20 We need the deprecated talloc_append_string() for now
so set the TALLOC_DEPRECATED symbol to allow use of this call
from ctdb_client.c

(This used to be ctdb commit 3afa5d945a56952a7f211af068d671945de960e5)
2010-08-19 14:48:19 +10:00
Ronnie Sahlberg
2e8aac6689 Merge commit 'rusty/ports-from-1.0.112' into foo
(This used to be ctdb commit 13e58d92f5f1723e850a82ae030d0ca57e89b1ee)
2010-08-19 13:17:56 +10:00
Ronnie Sahlberg
4c05f1900c Merge commit 'rusty/vacuum-fix-master'
(This used to be ctdb commit dc301b324d2c14a2425a965c076113c4fe97903e)
2010-08-19 13:16:35 +10:00
Ronnie Sahlberg
729f1ddea0 On RHEL, "service nfs stop;service nfs start" and "service nfs restart"
sometimes (very rarely) fails to restart the service.

    Add a function to restart NFSd on SLES and RHEL-like systems.

    If we detect the system is unhealthy due to kNFSd not running,
    try to restart the service again with "service nfs restart" and
    hope for the best.

CQ1019372

(This used to be ctdb commit 25c4ce7e919f13226219f036bcffd2be76b2f06c)
2010-08-19 07:18:22 +10:00
Ronnie Sahlberg
31126b2ef0 Add machine-readable output for the "ctdb gettickles <ip>" command
(This used to be ctdb commit c3eb53509331045074579468d94ed7e31101bba4)
2010-08-18 14:37:16 +10:00
Ronnie Sahlberg
5aa5f3e7bf Remove the structure ctdb_control_tcp_vnn since this is identical to the structure ctdb_tcp_connection.
Add a new "ctdb deltickle" command to delete tickles from the database.
This can ONLY be used for tickles created by "ctdb addtickle".

Push any "addtickle/deltickle" updates to other nodes every TickleUpdateInterval seconds.

(This used to be ctdb commit acded034e2f0dcae4c2c9e54e16a001caf23caec)
2010-08-18 12:36:03 +10:00
Rusty Russell
9fbb191b78 logging: give a unique logging name to each forked child.
This means we can distinguish which child is logging, esp. via syslog where we have no pid.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 68b3761a0874429b90731741f0531f76dcfbb081)
2010-08-18 11:46:32 +09:30
Rusty Russell
1a009aff73 takeover: prevent crash by avoiding free in traverse on RST timeout
After 5 attempts to send a RST to a client without any response, we free
"con"; this is done during a traverse.  This frees the node we are walking
through (the node is made a child of "con" down in rb_tree.c's
trbt_create_node()). Valgrind would catch this, as Martin confirmed.

So, we create a temporary parent and reparent onto that; then we free
that parent after the traverse, thus deleting the unwanted nodes.

CQ:S1019041
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 08f7f85477610a4916c1ec866aa467b28f1bbec3)
2010-08-18 11:40:17 +09:30
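The fix in 1a009aff73 follows a general pattern: do not destroy a container's entries while walking it; set them aside and destroy them once the walk is done. The sketch below is a Python analogue, not the talloc code: the doomed list plays the role of the temporary parent, and the function name, the dict layout and the attempt counter are assumptions.

```python
def reap_dead_connections(connections, max_attempts=5):
    """Count one more RST attempt per connection, then drop exhausted ones."""
    doomed = []  # stands in for the temporary parent
    for key, con in connections.items():
        con["attempts"] += 1
        if con["attempts"] >= max_attempts:
            # Deleting here would modify the dict while it is being traversed.
            doomed.append(key)
    # Traverse finished: now it is safe to remove the unwanted entries.
    for key in doomed:
        del connections[key]
    return doomed
```

In the C code the equivalent of the second loop is a single free of the temporary parent, which takes all the reparented nodes with it.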
Martin Schwenke
6ce1501aa1 Move NAT gateway firewall rules to recovered|updatenatgw events.
The existing code wasn't working as designed in the start event.  It
should work here.

BZ: 62613
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit aeb70c7e7822854eb87873a5c7783e27e6e72318)
2010-08-18 11:40:07 +09:30
Rusty Russell
5f2d43157d vacuum: disabling vacuuming during a freeze
We shouldn't even think about vacuuming when we've frozen the database
(which is earlier than when we set CTDB_RECOVERY_ACTIVE)

CQ:S1018154 & S1018349
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit d8df6835a931082af232c4b94f1dede6f16169f9)
2010-08-18 11:01:52 +09:30
Rusty Russell
0b07f91d36 vacuum: fix crash on vacuum abort
Martin Schwenke discovered that 517f05e42f17766b1e8db8f1f4789cbad968e304
("freeze: abort vacuuming when we're going to freeze.") used ctdb_db for
a logging message which is in fact uninitialized, causing a crash (even
if it wasn't actually logged).

Initialize it properly.  Also fix incorrect format in another logging
message introduced in that same change.

CQ:S1019093
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 8e518950ba281502318d6300f7a5ec6cdf6b5674)
2010-08-18 11:00:11 +09:30
Martin Schwenke
4e9fe3545c Test suite: loosen the getmonmode test.
Monitoring could be off at the beginning of the test.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 6a33a7715067175869ea2f3f15b64c3371079a6b)
2010-08-18 11:25:44 +10:00
Rusty Russell
af55c910a4 freeze: abort vacuuming when we're going to freeze.
There are some reports of freeze timeouts, and it looks like vacuuming might
be the culprit.  So we add code to tell them to abort when a freeze is
going on.

(This is based on the 1.0.112 branch version 517f05e42f, but far
 simpler since tdb is now robust against processes being killed during
 transaction commit)

CQ:S1018154 & S1018349
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit f5d7dc679501e607c2c83a248a89d3cada9df146)
2010-08-18 10:54:28 +09:30
Ronnie Sahlberg
44ff992806 Add a new "ctdb addtickle" command to manually add tickles to ctdbd
This can be used to set ctdbd up to generate a tickle for non-samba
services.
(samba contains code to set tickles up automatically)

(This used to be ctdb commit 7ef2cddad5326fdcc26138906948342039829495)
2010-08-18 11:09:32 +10:00
Ronnie Sahlberg
0e5be63bca update the example for the new signature of
ctdb_set_message_handler_send()

(This used to be ctdb commit 6aabe52d5ba629291aa630bc96a2b74dcecc5209)
2010-08-18 10:18:35 +10:00
Ronnie Sahlberg
e8ffb0d8a4 We use eventloop nesting in a couple of places, notably the sync
parts of the recovery daemon.

Initialize all event contexts to allow nesting

(This used to be ctdb commit 5bf6bd5e7f33aabbeb7b9707716ef99cf471e590)
2010-08-18 10:11:59 +10:00
Ronnie Sahlberg
ddf3c621c1 Merge commit 'rusty/libctdb-new' into foo
(This used to be ctdb commit 1566d2d23ab698896b3b6a76974a5c7452db4a62)
2010-08-18 09:53:52 +10:00
Rusty Russell
f93440c4b7 event: Update events to latest Samba version 0.9.8
In Samba this is now called "tevent", and while we use the backwards
compatibility wrappers they don't offer EVENT_FD_AUTOCLOSE: that is now
a separate tevent_fd_set_auto_close() function.

This is based on Samba version 7f29f817fa.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 85e5e760cc91eb3157d3a88996ce474491646726)
2010-08-18 09:16:31 +09:30
Rusty Russell
532e4a7077 talloc: update to 2.0.3 version from SAMBA
This is based on SAMBA as at revision 2de63aa280.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit cecd93be0a0aab868430dd43f8276bfb4e35f02e)
2010-08-18 09:11:58 +09:30
Martin Schwenke
a3e9fe2058 Test suite: Add more timestamping of debugging information.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 4cdf3b9adc7edfd80a2901ef8457ae67aab0829a)
2010-08-17 09:55:48 +10:00
Martin Schwenke
e28d2b8f22 Test suite: print date/time at test completion.
This should help with log cross-checking.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c0a916c40c623c0aa8245526283a064dbeea4b57)
2010-08-17 09:52:15 +10:00
Volker Lendecke
a79168f587 Correctly set docdir
(This used to be ctdb commit a69916d0687309766b0014dc9cee6a966aaa89da)
2010-08-16 11:28:05 +10:00
Rusty Russell
c27094742b tdb: workaround starvation problem in locking entire database.
(Imported from SAMBA 11ab43084b)

We saw tdb_lockall() take 71 seconds under heavy load; this is because Linux
(at least) doesn't prevent new small locks being obtained while we're waiting
for a big lock.

The workaround is to do divide and conquer using non-blocking chainlocks: if
we get down to a single chain we block.  Using a simple test program where
children did "hold lock for 100ms, sleep for 1 second" the time to do
tdb_lockall() dropped significantly.  There are ln(hashsize) locks taken in
the contended case, but that's slow anyway.
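
The divide-and-conquer idea can be sketched as follows.  This is only an
illustration in Python, not the tdb code: FakeChainLocks is a toy stand-in
for the per-chain locks, and all names here are hypothetical.

```python
class FakeChainLocks:
    """Toy model of per-chain locks; chains in `busy` are held by others."""

    def __init__(self, busy):
        self.busy = set(busy)
        self.held = set()
        self.blocking_waits = 0

    def try_lock(self, start, length):
        # Non-blocking attempt: fails if any chain in the range is busy.
        wanted = set(range(start, start + length))
        if wanted & self.busy:
            return False
        self.held |= wanted
        return True

    def lock_blocking(self, start, length):
        # Pretend we slept until the other holder released the range.
        wanted = set(range(start, start + length))
        self.blocking_waits += 1
        self.busy -= wanted
        self.held |= wanted


def lock_gradual(locks, start, length):
    """Lock [start, start + length) by halving the range on contention."""
    if locks.try_lock(start, length):
        return
    if length == 1:
        # Down to a single chain: this is the only place we block.
        locks.lock_blocking(start, 1)
        return
    half = length // 2
    lock_gradual(locks, start, half)
    lock_gradual(locks, start + half, length - half)
```

With 16 chains of which two are busy, the sketch blocks only on those two
single chains and takes everything else with non-blocking attempts.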

More analysis is given in my blog at http://rusty.ozlabs.org/?p=120

This may also help transactions, though in that case it's the initial
read lock which uses this gradual locking routine; the update-to-write-lock
code is separate and still tries to update in one go.

Even though ABI doesn't change, minor version bumped so behavior change
can be easily detected.

CQ:S1018154
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 9ec0009443a0ac4187ce5212a5143689daa58a02)
2010-08-16 10:22:21 +09:30
Rusty Russell
546eff9c93 tdb: Fix tdb_check() to work with read-only tdb databases.
(Import from SAMBA bc1c82ea13)
The function tdb_lockall() uses F_WRLCK internally, which doesn't work on
a fd opened with O_RDONLY. Use tdb_lockall_read() instead.
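
The restriction is a property of fcntl locks rather than of tdb, so it can
be observed from any language.  A minimal sketch, assuming a POSIX system
(such as Linux) and a local filesystem; the helper name is made up for
illustration:

```python
import fcntl
import os


def probe_locks_on_readonly_fd(path):
    """Report which fcntl lock types can be taken on an O_RDONLY descriptor."""
    fd = os.open(path, os.O_RDONLY)
    results = {}
    try:
        for name, kind in (("write", fcntl.LOCK_EX), ("read", fcntl.LOCK_SH)):
            try:
                # LOCK_EX maps to F_WRLCK, LOCK_SH to F_RDLCK.
                fcntl.lockf(fd, kind | fcntl.LOCK_NB)
                fcntl.lockf(fd, fcntl.LOCK_UN)
                results[name] = True
            except OSError:
                results[name] = False
    finally:
        os.close(fd)
    return results
```

On such a system the write lock is expected to be refused while the read
lock succeeds, which is why a read-only check has to use the read variant.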

(This used to be ctdb commit a5db1122ec48d7e7384066848457c850c1a6cf3c)
2010-08-16 10:20:59 +09:30
Rusty Russell
fa2a32d5ef tdb: remove unused variable in tdb_new_database().
(Imported from SAMBA 2eab1d7fdc)

(This used to be ctdb commit 52a87e608d0406aee9df99f7ac3ce16e834b520b)
2010-08-16 10:20:53 +09:30
Rusty Russell
55010cab63 tdb: fix short write logic in tdb_new_database
Commit 207a213c/24fed55d purported to fix the problem of signals during
tdb_new_database (which could cause a spurious short write, hence a failure).
However, the code is wrong: newdb+written is not correct.

Fix this by introducing a general tdb_write_all() and using it here and in
the tracing code.
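
The shape of such a "write everything" helper, sketched in Python rather
than C.  This is not the tdb_write_all() source, only the general pattern:
after a short write, continue from the byte offset already written.

```python
import os


def write_all(fd, data):
    """Write every byte of `data` to `fd`, resuming after short writes."""
    view = memoryview(data)
    written = 0
    while written < len(view):
        # os.write() may accept fewer bytes than offered; advance by the
        # number of bytes actually written and retry with the remainder.
        written += os.write(fd, view[written:])
    return written
```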

Cc: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 27ba0e5a6681063225df7244a85aa304c51c6948)
2010-08-16 10:20:19 +09:30
Martin Schwenke
03aa9ee702 Test suite: strengthen function _cluster_is_healthy().
If there's a chance that "ctdb status -Y" can return 0 but print
garbage then this function might return a false positive.

So, we do 2 things:

* Redirect stderr to >/dev/null rather than looking at it.  This
  minimises the chance that we will see garbage.

* Since we need at least 1 good line to decide the cluster is healthy,
  we sanity check each line to ensure it starts with :[0-9].
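
The sanity check in the second point might look like this in Python.  The
real test helper is shell; this sketch assumes the caller passes only the
per-node lines (any header already removed), and the function name is
hypothetical.

```python
import re

# A plausible node line starts with ':' followed by a digit.
NODE_LINE = re.compile(r"^:[0-9]")


def output_looks_sane(lines):
    """True if there is at least one line and every line looks like a node line."""
    lines = [line for line in lines if line.strip()]
    return bool(lines) and all(NODE_LINE.match(line) for line in lines)
```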

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d4189c7c3fceaa833f9f0446a2b06af6fed714ec)
2010-08-13 17:01:54 +10:00
Martin Schwenke
a9fb1e318b Test suite: use $CTDB rather than ctdb everywhere in ctdb_test_functions.sh.
Also ensure that $CTDB is set, defaulting it to "ctdb".

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 8222fef1e61836b9bfd406205f9ffb9396aa7480)
2010-08-12 14:13:07 +10:00
Martin Schwenke
00fec0e76c Test suite: improve wait_until_node_has_status()
This currently does "onnode any ... wait_until ...".  If ctdbd is
being shutdown on a node then that node might be chosen anyway, if it
is asked early enough.  Then we'll loop on that node but our ctdb
client command may always fail, causing a timeout rather than the
expected behaviour.

This puts the loop on the outside of the "onnode any" so that if the
"wrong" node is chosen initially then on the next iteration the choice
can be remade.
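
A sketch of the restructured loop in Python (the real helper is shell).
pick_node and has_status are hypothetical callables standing in for
"onnode any" and the status check; any sleep between attempts is omitted.

```python
def wait_until_node_has_status(pick_node, has_status, attempts):
    """Retry the check, choosing the node afresh on every iteration."""
    for _ in range(attempts):
        node = pick_node()  # the choice is remade each time round
        if has_status(node):
            return True
    return False
```

Because the node is picked inside the loop, an unlucky first choice costs
one iteration instead of the whole timeout.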

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a88ee78686bd5aa2b789f5959e0562315a13525d)
2010-08-12 13:48:33 +10:00
Martin Schwenke
d549c31031 Test suite: make addip test use $CTDB rather than ctdb in debug code.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 5bb6b28ab7b45b7242d100ae8f1483d02e1d0d1d)
2010-08-11 16:55:33 +10:00
Ronnie Sahlberg
8b0bbf960b Create a new command "ctdb sync" that is just an alias for "ctdb ipreallocate"
(This used to be ctdb commit eededd592c92c59b435f0046989b2327fcc280b1)
2010-08-10 09:49:55 +10:00
Ronnie Sahlberg
7139faaeac Update a log message to reflect that this no longer happens only
when trying/failing to ban a node.

(This used to be ctdb commit dc6b143c4785449e8c4ef7a46bf16adba750ab56)
2010-08-10 09:48:50 +10:00
Rusty Russell
a65cb6a9ae libctdb: add synchronous message handling and unregister, with tests.
It turns out that we *do* want a separate private arg for the message
handler and the completion callback, so we change that.

We also fix the prototypes of the remove_message functions as we
implement them.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 332375246eccd95da626f434f6d49dd9458a9787)
2010-08-09 15:41:32 +09:30
Ronnie Sahlberg
f7ead50738 Merge remote branch 'martins/master'
(This used to be ctdb commit 9ca09ee9129b787428a2ceac9731b12166dc8718)
2010-08-09 11:35:38 +10:00
Martin Schwenke
0f18859a6c Add some command-line options to ctdb_diagnostics.
In some contexts ctdb_diagnostics generates too many errors when it is
run on heterogeneous and machine-configured clusters.  In some
clusters some nodes are expected to be differently configured and also
machine-generated configuration files can have comments containing
timestamps.

This adds some command-line options that can be used to reduce the
number of errors reported:

    -n <nodes>  Comma separated list of nodes to operate on
    -c          Ignore comment lines (starting with '#') in file comparisons
    -w          Ignore whitespace in file comparisons
    --no-ads    Do not use commands that assume an Active Directory Server

The -n option simply allows ctdb_diagnostics to operate on a subset of
nodes, avoiding file comparisons with and data collection on nodes
that are differently configured.  For file comparisons, instead of
showing each file on the current node and then comparing other nodes
to that file, the file from the first (available or requested) node
is shown and then other nodes are compared to that.  That has resulted
in changes in output - that is, ctdb_diagnostics no longer prints
messages referencing the current node.

-c and -w are used to weaken comparisons between configuration files.
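
What "weakening" a comparison means can be sketched in Python.  The script
itself is shell; the function names and the exact normalisation below are
assumptions made for illustration.

```python
def normalise(text, ignore_comments=False, ignore_whitespace=False):
    """Return the lines of `text` after applying the requested relaxations."""
    lines = text.splitlines()
    if ignore_comments:
        # Like -c: drop lines starting with '#'.
        lines = [line for line in lines if not line.startswith("#")]
    if ignore_whitespace:
        # Like -w: compare with all whitespace removed.
        lines = ["".join(line.split()) for line in lines]
    return lines


def files_match(a, b, **relaxations):
    """Compare two file contents under the same relaxations."""
    return normalise(a, **relaxations) == normalise(b, **relaxations)
```

Two machine-generated files that differ only in a timestamp comment and in
spacing compare equal once both relaxations are enabled.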

--no-ads can be used to avoid running ADS-specific commands if a
cluster uses LDAP (or other non-ADS) configuration.

This also fixes a number of bugs in related code:

* A call to onnode was losing the >> NODE ...  << lines because they
  now go to stderr.  This was changed in onnode long ago but
  ctdb_diagnostics was never updated to match.

* ctdb_diagnostics was counting lines in /etc/ctdb/nodes to determine
  what nodes to operate on.  For some time the nodes file has
  supported syntax that makes this invalid.  "ctdb listnodes -Y" is
  now used to list available nodes.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 36c8244a0f68c7c9bbee40982f230e9d14d3c0ea)
2010-08-06 11:10:56 +10:00
Ronnie Sahlberg
4424c115cb update the docs that ctdb freeze is no more
(This used to be ctdb commit 79ef9909dfa0904d789c69eb6b9c80e8908a1100)
2010-08-05 16:35:37 +10:00
Ronnie Sahlberg
043045dcc5 remove the "ctdb freeze" debugging command
(This used to be ctdb commit bd005b987255eb65cd3826dce984281ee757daf6)
2010-08-05 16:30:47 +10:00
Martin Schwenke
b50ec65963 Test suite: remove unnecessary verbosity from enable/continue tests.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 69c95b2a42f55b80cd8d91a90ab55166f964163b)
2010-08-05 16:03:21 +10:00
Martin Schwenke
f66b5b46d6 Test suite: Fix typo in continue test.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c2bce140da7c4b118394ee77bb9d0348d27e7e95)
2010-08-05 16:01:23 +10:00
Martin Schwenke
77ad2be488 Test suite: weaken ctdb continue/enable tests for non-deterministic IPs.
These tests currently wait for the old IPs to fail back to the test
node.  This isn't guaranteed with DeterministicIPs disabled.

This changes those tests to wait until the test node gets at least 1
IP assigned.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit e9b3f5b1b51d541a911a27eb4348b368f28d185e)
2010-08-05 15:58:56 +10:00
Martin Schwenke
b930c885b3 initscript: wait until we can ping ctdbd before setting tunables.
Currently we do a "sleep 1" after starting and before running
set_ctdb_variables to set the tunables.  This is too arbitrary and
might fail if the system is heavily loaded.  This, for example, could
result in some nodes running with DeterministicIPs and some without,
in which case a different IP allocation algorithm would run depending
on who is the recmaster!

This makes the start function wait until "ctdb ping" succeeds (with 10
second timeout) before trying to run set_ctdb_variables.  If a timeout
occurs then the start function attempts to kill ctdbd before exiting
with a failure.
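
The control flow described above, sketched in Python.  The initscript is
shell; ping, set_variables and kill are hypothetical callables, and the
clock and sleep parameters exist only so the sketch can be exercised
without real delays.

```python
import time


def start_and_configure(ping, set_variables, kill, timeout=10.0,
                        interval=1.0, clock=time.monotonic, sleep=time.sleep):
    """Wait until ping() succeeds, then set tunables; kill on timeout."""
    deadline = clock() + timeout
    while not ping():
        if clock() >= deadline:
            kill()  # daemon never answered: clean up and report failure
            return False
        sleep(interval)
    set_variables()  # only runs once the daemon is known to respond
    return True
```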

It also cleans up the status reporting code for Red Hat and SUSE so
that the final status code is reported.  Currently there are cases
where a correct status is prematurely reported before a failure
occurs.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit cdcd05662a30b51caaeeab4ac44138cac2474e0a)
2010-08-05 15:29:40 +10:00
Martin Schwenke
774582c360 Test suite - make the ctdb_fetch test cope with "Reqid wrap!" messages.
Recent CTDB versions notice the wrap and print this message.  The test needs to
cope.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit b93b60ec96d02ce4f54921e85a5c5554d1fc0c55)
2010-08-05 13:43:50 +10:00
Martin Schwenke
dff9282917 Test suite: remove thaw/freeze tests.
They test debugging commands that no longer operate as expected.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d33fa4d6557aab1938049f194c2de55f2c395bd2)
2010-08-05 11:40:05 +10:00
Martin Schwenke
4817f7e4ba Test suite - fix addip test.
The test currently checks that all existing IPs plus the newly added
IP are on the test node after "ctdb addip" is run.  With
DeterministicIPs enabled, if the new IP is "before" other IPs then the
other IPs may be shuffled by the deterministic IPs modulo algorithm.
This will happen on the 1st recovery after the move.  Sometimes this
recovery happens before we get the list of IPs to check and sometimes
after, so the test is racy.
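
Why an IP that sorts "before" the others shuffles them can be seen in a
simplified model of a modulo allocation.  The sketch below assumes that
sorted IPs are dealt out to nodes round-robin and ignores node health; it
is not the CTDB implementation.

```python
import ipaddress


def deterministic_allocation(ips, num_nodes):
    """Assign the i-th IP (in sorted order) to node i % num_nodes."""
    ordered = sorted(ips, key=ipaddress.ip_address)
    return {ip: i % num_nodes for i, ip in enumerate(ordered)}


before = deterministic_allocation(
    ["10.0.2.110", "10.0.2.120", "10.0.2.130"], num_nodes=2)
after = deterministic_allocation(
    ["10.0.2.105", "10.0.2.110", "10.0.2.120", "10.0.2.130"], num_nodes=2)
# Existing IPs whose node changed once the new address was added.
moved = sorted(ip for ip in before if before[ip] != after[ip])
```

In this model every existing IP changes node once 10.0.2.105 is added,
because each one shifts by one position in the sorted list.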

The fix is to simply check for the presence of the new IP and not
worry about the others.  This reduces whatever value this test
had... but you can't have everything.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 1ef7c8e64c7a39330be09ae4d00b70238133e0b5)
2010-08-04 16:08:12 +10:00
Martin Schwenke
9aa6a99740 Merge remote branch 'martins/master'
(This used to be ctdb commit 5d9e4b6ee7d2b5290a74e7be79bdf51a43b72f43)
2010-08-04 16:05:39 +10:00
Martin Schwenke
7edcb89857 Test suite - try to make addip test more reliable and add some debugging.
This test is failing in some situations.  The "ctdb addip" command
works but the IP never appears in the "ctdb ip" output.

Try restricting the last octet to be between 101-199.  At the moment
addresses like 10.0.2.1 are being chosen and these are often the
address of the host machine in autocluster configurations... so might
cause weirdness.

Also add some debugging if checking for the IP address times out.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit ae52cb63756bc60de8d32e01bac5d70975a1c7a0)
2010-08-04 13:16:06 +10:00
Martin Schwenke
807567e992 Testing: IP allocation simulation - add option to change odds of a failure.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit b2a2e301025d7fbfe5eeaac436693cde6d404490)
2010-08-03 11:51:14 +10:00
Martin Schwenke
4ffb6495ff Testing: IP allocation simulation - clean up usage message.
Group options better and make the language consistent between options.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit bc38c17e4115fae00c89d00537fdcfe621111b37)
2010-08-03 11:41:50 +10:00
Martin Schwenke
4728bf6ece Testing: IP allocation simulation - print maximum number of unhealthy nodes.
This can imply something about imbalance.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit ecb80e2b6be9326708d1fc87ad3028c6836d5858)
2010-08-03 11:37:34 +10:00
Martin Schwenke
8ca925fe5d Testing: IP allocation simulation - improve help for options.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 058501b92f602e7d2240d1cb08ed78a807564c48)
2010-08-03 11:36:33 +10:00
Martin Schwenke
8cc6ed1d0e Testing: IP allocation simulation - make usage/failure more obvious.
Tweak the usage message for -g option.

Print an error if no node groups defined, instead of curious Python
error.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 8b883eb9346b8278d268e35b56ac680cd9526b97)
2010-08-02 15:46:23 +10:00
Martin Schwenke
326514f152 Testing: IP allocation simulation - rename an example to node_group_extra.py.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 974f849df0aca2cfedb38fa815894955e32803a8)
2010-08-02 15:09:13 +10:00
Martin Schwenke
d438b2398f Testing: IP allocation simulation - rename an example to node_group_simple.py.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 0a2a5602233a8208e2729192e50d816faed0151a)
2010-08-02 15:07:56 +10:00
Martin Schwenke
4af049780a Testing: IP allocation simulation - add general node group example.
This allows node pool configuration to be specified on the
command-line.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d382d9023928f75f360a115ae1e9c1036423416e)
2010-08-02 15:06:39 +10:00
Martin Schwenke
0fdd7566c7 Testing: IP allocation simulation - update options processing in examples.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a65ca1a71386f40080dd553756f3600d3b20d523)
2010-08-02 15:01:47 +10:00
Martin Schwenke
ecdbd99557 Testing: IP allocation simulation - Update README.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit ed64b7f2b3cd920bb0f5dfd7f64ed0afc0b99fc1)
2010-08-02 14:58:15 +10:00
Martin Schwenke
ae0f339173 Testing: IP allocation simulation - fix nondeterminism in do_something_random().
The current code makes random choices from unsorted lists.  This
ensures the lists are sorted.

Also, make the code easier to read by doing the random selection from
lists of PNNs rather than lists of Node objects.
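
The fix amounts to sorting before choosing, so that a given seed always
sees the same candidate order.  A minimal sketch; the function name is
hypothetical and this is not the simulation's actual code.

```python
import random


def pick_random_pnn(rng, pnns):
    """Choose a PNN reproducibly: sort first so input order is irrelevant."""
    return rng.choice(sorted(pnns))
```

With the same seed, candidate lists that differ only in order now yield
the same choice, which makes a simulation run repeatable.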

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a01244499dc3567f5aa934b1864b9bc183a6c242)
2010-08-02 14:24:00 +10:00
Martin Schwenke
eac5edf322 Testing: IP allocation simulation - Tweak options handling and Cluster.diff().
process_args() must now be called by programs importing this module.
Options are put into global variable "options", which can be
referenced using "ctdb_takeover.options".

Can now pass extra option specifications to process_args().
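
One way such an interface could look, sketched with argparse.  The
signature, the use of argparse and the --pools option are assumptions for
illustration; only the names process_args and options, and the -v flag,
come from these commit messages.

```python
import argparse

options = None  # filled in by process_args()


def process_args(extra_options=(), argv=None):
    """Parse the command line, accepting caller-supplied extra options."""
    global options
    parser = argparse.ArgumentParser(description="IP allocation simulation (sketch)")
    parser.add_argument("-v", "--verbose", action="store_true")
    # Each extra option is a (flags, keyword-arguments) pair.
    for flags, kwargs in extra_options:
        parser.add_argument(*flags, **kwargs)
    options = parser.parse_args(argv)
    return options
```

An example scenario would call it once at start-up, for instance
process_args([(("--pools",), {"type": int, "default": 2})]), and read the
result from the module-level variable afterwards.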

Remove global variable prev and make it a Cluster object variable.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a32298e7bc819694518e859f100f9444ff5663cd)
2010-08-02 14:20:12 +10:00
Martin Schwenke
ef77f613fa Testing: IP allocation simulation - update copyright message.
There's a lot of new code here, so let's make the copyright message
make sense.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit e6e56e5989def6704b116e806c1f261c7f3fc03f)
2010-08-02 14:16:02 +10:00
Martin Schwenke
de4223fa54 Testing: IP allocation simulation - add command line option for random seed.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 8362029c7cfc1041e46ee2116aa5cade6edce435)
2010-08-01 11:53:28 +10:00
Martin Schwenke
2e570851d5 Testing: IP allocation simulation - save some warnings for verbose mode.
We don't need to see warnings about unallocatable IPs unless we're in
verbose mode.  Can now be run with -n (and without -v or -d) to see
just the statistics.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 55370936ac5def5ebf138910388a2ddc2df9c20f)
2010-08-01 11:41:52 +10:00
Martin Schwenke
d7d2c64834 Testing: IP allocation simulation prints final imbalance in statistics.
This is useful to know.  When things get unbalanced they tend to stay
that way.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a40faa2096effc2657ac05b729f3259bbb2e1fed)
2010-08-01 11:41:02 +10:00
Martin Schwenke
d7e996915c Testing: In IP allocation simulation count total number of events.
This starts at -1 because we always have to do the initial allocation.

No longer print event number for each event by default, only when
verbose is enabled.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c9a761726d141bcaa8ba7851150f71a8130b473a)
2010-08-01 11:39:30 +10:00
Martin Schwenke
88337cb080 Testing: Add imbalance information to IP allocation simulation.
Implement the imbalance calculations.

Also add command-line option to display imbalance for each step.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit f50a12f6d06ed67efadd2a892d62c01e67310e7d)
2010-08-01 11:37:35 +10:00
Martin Schwenke
f73f2d7581 Testing: Add Python IP allocation simulation.
Includes simulation module and example scenarios.  This allows you to
test and perhaps tweak an algorithm that should be the same as the
current CTDB IP reallocation one.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d148e7a7cb840febbdf56ba2e39c314cc2d7ac24)
2010-07-30 16:50:45 +10:00
Martin Schwenke
fe64a8f87a Optimise 61.nfstickle to write the tickles more efficiently.
Currently the file for each IP address is reopened to append the
details of each source socket.

This optimisation puts all the logic into awk, including the matching
of output lines from netstat.  The source sockets for each
destination IP are written into an array entry and then each array
entry is written to the corresponding file in a single operation.
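
The same grouping idea in Python rather than awk.  The input format (pairs
of source socket and destination IP) and the function name are assumptions
for illustration; the point is that each file is opened and written once.

```python
import os
from collections import defaultdict


def write_tickles(connections, directory):
    """Group source sockets by destination IP, then write each file once."""
    by_dest = defaultdict(list)
    for source_socket, dest_ip in connections:
        by_dest[dest_ip].append(source_socket)
    for dest_ip, sources in by_dest.items():
        # One open and one write per destination IP, instead of
        # reopening the file to append each source socket.
        with open(os.path.join(directory, dest_ip), "w") as f:
            f.write("".join(source + "\n" for source in sources))
    return dict(by_dest)
```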

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 6549e9b01538998d51a5f72bfc569776d232b024)
2010-07-30 16:50:18 +10:00
Martin Schwenke
5027d2b8b0 Test suite: handle extra lines in statistics output.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit b2362cc7773bb08c7dfdaf2c87d4b59460686659)
2010-07-30 16:50:00 +10:00
Martin Schwenke
697fcfd15a Test suite: handle change to disconnected node error message.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 20ea31e4ed893eb58cb2efa0b6fb13bcf4031918)
2010-07-30 16:49:51 +10:00
Ronnie Sahlberg
ddb1c74066 Add a code-style document.
Shamelessly sto^H^H^Hborrowed from samba3.

(This used to be ctdb commit 8024d9e2d589bfe4dee1cb9a79bec663738cb7fa)
2010-07-30 16:37:22 +10:00
Stefan Metzmacher
794230775c events/10.interface: we need to mark interfaces as "up" if we don't know how to monitor them
metze

(This used to be ctdb commit 1e08d1578d1960fcfc5fdd85492fbd6d194e5e94)
2010-07-30 16:33:27 +10:00
Ronnie Sahlberg
c5de7cfb8c Merge commit 'rusty/master'
(This used to be ctdb commit b4391c00476cde74101736986dfcd2be6c959edc)
2010-07-30 16:25:40 +10:00