mirror of https://github.com/samba-team/samba.git synced 2025-01-17 02:05:21 +03:00

3610 Commits

Author SHA1 Message Date
Martin Schwenke
667a743fff Test suite: Strip architecture suffix from CTDB RPM package version.
Later versions of RPM seem to include it but we don't want it.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 6c8eedb21a5e231d4a26ac26706ea51f348a27e0)
2011-08-08 16:34:48 +10:00
Martin Schwenke
b3db37cd30 Test suite: remove getmonmode test.
This can't be made 100% reliable since the monitor mode can change
underneath us due to some event.  Therefore, the test is useless.

Signed-off-by: Martin Schwenke <martin@meltin.net>

Conflicts:

	tests/simple/20_ctdb_getmonmode.sh

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 98ccdc6241a73036c4f210bb510f1cb5cff588cc)
2011-08-08 16:33:47 +10:00
Martin Schwenke
94f0fd9cd5 Test suite: Try much harder to get a healthy cluster when it is restarted.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 91e74cb01a11012e41ef9633c98f13ddbb2e5908)
2011-08-08 16:28:30 +10:00
Martin Schwenke
000fbb607e Test suite: when the cluster flip-flops (un)healthy, using "ctdb status -Y".
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d3dc9410501767c07d9b0106bb73c979d869c127)
2011-08-03 16:06:40 +10:00
Martin Schwenke
7e48ba58c6 Test suite: Print debug info from cluster nodes when time jumps occur.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 21cdc7ed6942238faeb42983c862d4abc3f54ffb)
2011-08-03 16:06:40 +10:00
Martin Schwenke
372f0a1bff Test suite: Add debug for cluster (un)healthy flip-flop after restart.
We're seeing the cluster become healthy after a restart and then
revert to being unhealthy.  It looks like there's a race and the
cluster shouldn't have been healthy, given that we seem to see that
the monitor cycle hasn't yet been run.

This collects some state debug info from all nodes after the cluster
becomes healthy.  This is printed if the cluster is then unexpectedly
unhealthy a short time later.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c2efb5897e4258df649149f9904d7ac47322e1b4)
2011-08-03 16:06:40 +10:00
Martin Schwenke
659f54e61a Test suite: add more debug to time jump post mortem.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit fed3c2b80b8add8d1cf33abdd5dd8d8001af44d4)
2011-08-03 16:06:40 +10:00
Martin Schwenke
e05b902f99 Test suite: add automated checking of time logs.
This depends on the format of onnode output and also depends on
simple/00_ctdb_onnode.sh having been run.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 93b53b186df55942bf4d9e90cae329f47889af72)
2011-08-03 16:06:40 +10:00
Martin Schwenke
7e5549a54e Test suite: make time log use seconds since epoch.
Easier to implement automatic checking.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 18db530880849b59445d7aa508bf218bdd77ea1c)
2011-08-03 16:06:40 +10:00
Martin Schwenke
88fc88caf5 Test suite: CTDB_SAMBA_SKIP_SHARE_CHECK test now uses _loadconfig().
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 651e6703b6dc4d11ba7d6d0b44d3be1f485a0f75)
2011-08-03 16:06:40 +10:00
Martin Schwenke
3a18451cef Test suite: CTDB_NFS_SKIP_SHARE_CHECK test now uses _loadconfig().
The manual replacement of loadconfig() had bit rotted and no longer
worked.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit bf23e7166385d305c6860b37c120f70a9aa33aa5)
2011-08-03 16:06:40 +10:00
Martin Schwenke
4f4cf7b100 Test suite: make time logging only happen on a real cluster, not local daemons.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a6b3a7b7db9aa5fc971aae11b9b012e72c7d240c)
2011-08-03 16:06:40 +10:00
Martin Schwenke
bb32a6cf70 Test suite: add time logging.
We're seeing some weirdness with CTDB controls timing out.  We're
wondering if time is jumping forward, so this creates a time log on
each node that we can examine later if tests fail weirdly.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 5d82d89ee99f10bead101aebda645a80435ba246)
2011-08-03 16:06:40 +10:00
Martin Schwenke
7cec7807e1 Tests: eventscripts and onnode tests use stubs/ subdirectory instead of bin/.
This sets up a more useful convention and avoids future .gitignore
problems.

Resolved conflict while cherry-picking this:

  Don't take the eventscripts files for this branch.  We'll put them
  elsewhere.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a9879e37d4e3bb714ef6c0c4144c6949daec0b53)
2011-08-03 15:55:35 +10:00
Martin Schwenke
8006aec7b1 Tests: run_tests script no longer prints filename in summary descriptions.
If filenames should be printed in descriptions in the summary then the
descriptions should include the filename.  A better option is to
include something more human-readable that makes the test just as
easily identifiable.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 0efdbd61bdc2343e5459959b300bccc9986b1d78)
2011-08-03 15:51:44 +10:00
Martin Schwenke
3ee6a63e47 Tests: onnode tests changed to use a simple define_test() function.
This makes global changes easier.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 3af086398fecb5f7c501190f9620b9c7b201f0ca)
2011-08-03 15:51:44 +10:00
Martin Schwenke
51ef4b4e55 Tests: add initial onnode tests
Add some simple tests for the onnode command.  These use fake ssh and
ctdb commands that are added to $PATH.  The infrastructure used is
quite flexible and would allow more complex tests to be written.

As-is, these tests expose some bugs in an older version of onnode
that is included so it can be used to validate some of the tests.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit f7f9d0943474cb2de7832d7ca95210ea9e9c772b)
2011-08-03 15:51:44 +10:00
Martin Schwenke
8d2c726deb Tests: change output format of run_tests script and add -q option
Putting PASSED/FAILED on the left makes it easier to scan the results
and simplifies the code.  Also put stars around the word "*FAILED*"
to make it more obvious.

Also add a -q option to throw away test output and only display the
summary (if -s is also specified).

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c44b632b010b7d57007f3c8f294271c7e0217e0d)
2011-08-03 15:51:44 +10:00
Martin Schwenke
eae91c959e Test suite: add a -d option to the run_tests script.
This causes summary lines (when used with -s) to be pretty printed and
include the test description.  This is the 4th line of the test output
- that is, immediately after the header.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 0e5cc2a58b0d38e10a2ef9e81dc887c20f3fbdcb)
2011-08-03 15:51:44 +10:00
Ronnie Sahlberg
82d59bbc8e make test: add two missing events to the special test eventscript
(This used to be ctdb commit 771b1e9c2e694ccc8825fb8088174c122532e74d)
2011-08-02 19:25:14 +10:00
Martin Schwenke
652bf326e1 Eventscripts - 10.interfaces should not check orphaned interfaces.
If the last IP address on an interface is removed then that
interface should no longer be checked by 10.interfaces.  However,
"ctdb ifaces" still lists such interfaces so they are currently
checked.

The problem really needs to be addressed in ctdbd but a neat quick
eventscript fix will be minimally invasive...

This changes the code to use "ctdb -Y ip -v" instead of "ctdb -Y
ifaces".  The former includes details of all public addresses and
associated interfaces, so when an address is removed there is no
output for it.  This prevents orphaned interfaces from being listed.

The logic is also slightly improved so that $IFACES includes just a
(non-uniquified) list of interfaces, allowing an existing loop to be
removed.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 49b2d1bd9554461ed8edbfc21e777c0eca9e1443)
2011-08-02 16:53:14 +10:00
Ronnie Sahlberg
a17ae8a8be Merge branch 'master' of 10.1.1.27:/shared/ctdb/ctdb-master
(This used to be ctdb commit 518945e59e2e48f07fcc0955f3aa81cd0d946aea)
2011-07-29 09:04:01 +10:00
Martin Schwenke
5ac67504ca Tests: Initial test code for LCP2 IP allocation algorithm.
Move struct ctdb_public_ip_list to ctdb_private.h and put some
definitions for some functions from ctdb_takeover.c there.  This
allows those functions to be called from unit tests.

Add ctdb_takeover_tests.c and the Makefile support to build it.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 9d34be0233edf3bc022345c0494c4b2a4d7f8480)
2011-07-29 09:01:36 +10:00
Martin Schwenke
ff1a81c872 IP allocation - add LCP2 algorithm.
The current non-deterministic IP allocation algorithm balances IPs
across the whole cluster.  It does not consider different
interfaces/VLANs/subnets, so these different groups of IPs aren't
generally well balanced.

This adds the LCP2 algorithm for IP allocation and allows it to be
enabled by setting the "LCP2PublicIPs" tunable to 1.

The LCP2 algorithm calculates the imbalance of a node by totalling the
squares of the distances between each IP on the node.  The IP distance
is defined as the length of the longest common prefix (LCP) of bits
that is found when comparing 2 IPs.  The imbalance of a cluster is the
maximum imbalance for any node.  At each step the algorithm selects
the IP/node combination whose allocation best reduces the imbalance
of the cluster.

The implementation splits out the IP allocation part of
ctdb_takeover_run() into new function ctdb_takeover_run_core(), and
then extracts out the basic IP assignment code into new functions
basic_allocate_unassigned() and basic_failback().  3 new functions
lcp2_init(), lcp2_allocate_unassigned() and lcp2_failback() implement
the LCP2 algorithm, and are hooked into ctdb_takeover_run_core().

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 61fc7fbd0235469df22deb6581c6bd47e30bc0be)
2011-07-29 09:01:17 +10:00
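The imbalance measure described in the LCP2 commit above can be sketched as follows. This is an illustrative Python sketch, not the ctdb_takeover.c code: the function names are hypothetical, each unordered pair of IPs is counted once (an assumption), and addresses of different families are assumed to share no prefix.

```python
import ipaddress
from itertools import combinations


def ip_distance(a, b):
    """Length in bits of the longest common prefix of two IP addresses."""
    ia, ib = ipaddress.ip_address(a), ipaddress.ip_address(b)
    if ia.version != ib.version:
        return 0  # assumption: IPv4 and IPv6 addresses share no prefix
    diff = int(ia) ^ int(ib)
    # Leading bits that are equal = total bits minus position of first difference.
    return ia.max_prefixlen - diff.bit_length()


def node_imbalance(ips):
    """Sum of squared distances over each unordered pair of IPs on one node."""
    return sum(ip_distance(a, b) ** 2 for a, b in combinations(ips, 2))


def cluster_imbalance(allocation):
    """Maximum node imbalance; allocation maps node number -> list of IPs."""
    return max((node_imbalance(ips) for ips in allocation.values()), default=0)


def best_allocation(unassigned, allocation):
    """Try every IP/node combination and return (imbalance, ip, node) for the
    one that leaves the cluster with the lowest imbalance."""
    best = None
    for ip in unassigned:
        for node, ips in allocation.items():
            trial = dict(allocation)
            trial[node] = ips + [ip]
            cost = cluster_imbalance(trial)
            if best is None or cost < best[0]:
                best = (cost, ip, node)
    return best
```

With nodes holding 10.0.0.1 and 10.0.1.1, the sketch places 10.0.0.3 on the second node: the shorter common prefix gives a lower squared distance, so IPs from the same subnet tend to be spread across nodes.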
Ronnie Sahlberg
a5cd8a3270 Merge branch 'master' of 10.1.1.27:/shared/ctdb/ctdb-master
(This used to be ctdb commit 0e60a738f9a6275ed45abc3d933f872d93132d92)
2011-07-29 08:53:43 +10:00
Ronnie Sahlberg
e707f23596 Update the delip command
Don't talloc_free(vnn) immediately but postpone it until later when
the eventscript callback has completed.

CQ S1026664

(This used to be ctdb commit 0a99e8742a261b1d3a2c8830f5c19ea6c2c47cad)
2011-07-29 08:50:48 +10:00
Rusty Russell
87ea4818bf eventscript: fix callback after free
ctdb_event_script_callback() takes a mem_ctx arg which it doesn't use, but
the implication is pretty clear, that when that mem_ctx is freed, the callback
shouldn't happen.  Indeed, Ronnie reproduced a case where that callback
refers to freed memory, in the ip reallocation code under stress.

So attach the callback to the mem_ctx they give us, and remove it from the
script state structure when that's freed.  It's a bit weird, but it works.

CQ: S1026179
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 6fcd867cc835ef1ffc1c50964f135c346503d40c)
2011-07-29 08:50:39 +10:00
Michael Adam
fe114ca5d5 packaging: honour rpm build target options handed in to makerpms.sh
This allows calling, e.g., "makerpms.sh -bs" to build only the source RPM.

(This used to be ctdb commit c6bfba2bb66962b7b05d708f0747002700991472)
2011-07-22 15:48:27 +02:00
Ronnie Sahlberg
69024f40bf Merge branch 'master' of ssh://git.samba.org/data/git/ctdb
(This used to be ctdb commit a1b3661973489f0111e7975fec422fb99a25f0c8)
2011-07-20 15:53:11 +10:00
Ronnie Sahlberg
0a86d6ed91 Add text about "ban"/"unban" not being permanent and that the recovery daemon can auto-unban nodes. Suggest using "stop"/"continue" instead.
(This used to be ctdb commit 8e30dffad5b1385818b2d7350d6c3767a220d745)
2011-07-09 07:14:32 +10:00
Michael Adam
926a6d7f05 web: correctly terminate list items <li> with </li> instead of with <br>
(This used to be ctdb commit 3f698e69a56305c5ec27b8d119bf2d57d5cd2ec6)
2011-07-08 10:07:42 +02:00
Michael Adam
577f58a8be web: add Stefan Metzmacher to the list of CTDB developers.
(This used to be ctdb commit 912a33cebe7c51b33cda2e6d5f2b3a481fa7fd49)
2011-07-08 10:06:14 +02:00
Ronnie Sahlberg
c93a968619 When trying to re-balance the ip assignment and shuffle ips from
nodes with many addresses to nodes with few addresses,
loop up to num_ips+5 times instead of only 5 times.

When we have very many public ips per node, we might need to loop more than
5 times or else we will exit without reaching optimal balance.

(This used to be ctdb commit aa8114a625a637277561a66c80bdece3c27e9e20)
2011-07-06 13:14:13 +10:00
Ronnie Sahlberg
20a7c19691 Add log output to wipedb and backupdb
CQ S1025379

(This used to be ctdb commit 6f51d4a75f8a9f2cdb8ecde946ed31809ab5a415)
2011-07-06 13:13:18 +10:00
Ronnie Sahlberg
18af72f08f change the name for the key for the record where we store the public address config from public-addresses... to public_addresses...
CQ1019030

(This used to be ctdb commit 114d5034ff4880848588caf493382a537a1469ae)
2011-06-28 15:40:46 +10:00
David Disseldorp
58c7f5bf00 client: handle transient connection errors
Client connections to the ctdbd unix domain socket may fail
intermittently while the server is under heavy load. This change
introduces a client connect retry loop.

During failure the client will retry for a maximum of 64 seconds, the
ctdb --timelimit option can be used to cap client runtime.

Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit dc0c58547cd4b20a8e2cd21f3c8363f34fd03e75)
2011-06-23 15:56:17 +02:00
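The connect retry loop described in the commit above can be sketched like this. It is a minimal Python illustration, not the CTDB client code (which is C): the function name, the 1-second retry interval, and the injectable connect/sleep/clock parameters are assumptions made for the example. Only the 64-second limit comes from the commit message.

```python
import socket
import time


def connect_with_retry(path, max_wait=64.0, interval=1.0,
                       connect=None, sleep=time.sleep, clock=time.monotonic):
    """Connect to a unix domain socket, retrying on failure until max_wait
    seconds have passed.  interval is an assumed delay between attempts."""
    if connect is None:
        def connect(p):
            s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            try:
                s.connect(p)
            except OSError:
                s.close()  # do not leak the descriptor on a failed attempt
                raise
            return s

    deadline = clock() + max_wait
    while True:
        try:
            return connect(path)
        except OSError:
            # Give up if another wait would take us past the deadline.
            if clock() + interval > deadline:
                raise
            sleep(interval)
```

A caller that needs a tighter bound would pass a smaller max_wait, which plays roughly the role the commit assigns to the ctdb --timelimit option.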
Mathieu Parent
4a43450968 Manpage for ping_pong
Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit af75d3e37412e03d3978073edbe6dee78f265c3c)
2011-06-23 15:56:17 +02:00
Martin Schwenke
5ddc10128a onnode: fix natgwlist nodespec
This hasn't worked for a while if ever.

We treat this case specially because the output has 2 words on the 1st
line.  We also handle the error case where /etc/ctdb_natgw_nodes
exists but none of the other $NATGW_* configuration is done.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 66e89797c7866d207a5bbf1836f52d70dba7cea6)
2011-06-08 14:24:00 +10:00
Martin Schwenke
1ef399e48d onnode: fix get_nodes_with_status()
Setting IFS and looping through items with colons in them doesn't work.
Change this to read through the output line by line.  The header line
needs to be thrown away by throwing away everything up to the 1st
newline.

Keep stderr from the "ctdb status" command, otherwise debugging is
impossible.

On error, append any output from ctdb to onnode's error message.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d60592cf99999f10344a05ef0571fb300bb9d97c)
2011-06-08 14:23:40 +10:00
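The line-by-line approach from the commit above (discard the header, then split each line on colons) can be sketched as follows. This is a Python illustration, not the onnode shell code. The SAMPLE text is made up for the example and is not captured from a real "ctdb status -Y" run: it assumes colon-delimited lines with the node number first, the IP second, and any number of 0/1 status flags after that, and it assumes IPv4 addresses (IPv6 addresses contain colons and would need different handling).

```python
# Hypothetical sample; real column names and counts vary between CTDB versions.
SAMPLE = """\
:Node:IP:Disconnected:Banned:Disabled:Unhealthy:
:0:192.168.1.1:0:0:0:0:
:1:192.168.1.2:0:0:0:1:
:2:192.168.1.3:0:0:0:0:
"""


def ips_with_all_flags_clear(status_output):
    """Return the IPs of nodes whose status flags are all 0."""
    ips = []
    # Throw away the header line, then process one node per line.
    for line in status_output.splitlines()[1:]:
        fields = line.strip().strip(":").split(":")
        if len(fields) < 2:
            continue  # blank or malformed line
        _node, ip, *flags = fields
        # No assumption about how many flag columns there are.
        if all(flag == "0" for flag in flags):
            ips.append(ip)
    return ips


print(ips_with_all_flags_clear(SAMPLE))
```

Because the flags are taken as "everything after the IP", adding a status column to the output does not require changing the parser.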
Martin Schwenke
41436193dd onnode: Remove an unnecessary comment.
The comment about $CTDB_NODES_SOCKETS is meaningless.  The code it
refers to works just fine with $CTDB_NODES_SOCKETS.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 74e69a564bac653dadfffe8b08145b9b3be16e61)
2011-06-08 14:23:14 +10:00
Martin Schwenke
f730194f12 onnode: Future-proof get_nodes_with_status().
The current code requires knowledge of the number of status bits
output by "ctdb status -Y".

This changes the code to be completely general.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit e1788f25fde3d1f26bf4831a331741aa280f6fbc)
2011-06-08 14:22:49 +10:00
Martin Schwenke
f3ea7bec68 onnode: Exit with error for unknown command-line flags.
Use of "local" was masking errors in command-line processing.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit ca80adda7517b43147ef30156ae34c66b29fa2bd)
2011-06-08 14:22:16 +10:00
Martin Schwenke
350f3e5b09 onnode: Be defensive when listing IPs of nodes with designated status.
The current version gives the last item left after stripping the known
fields.  If an insufficient number of status fields is stripped then
this would return a residual status field value, which turned out to
be a valid IP address for localhost...  so no error occurs.

This change means that the node number is stripped and any residual
status field value will stay appended, causing an error the first time
this command is tested.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 74715e6ec7b67c6f0e863aa51c87279758d6bf91)
2011-06-08 14:21:53 +10:00
Martin Schwenke
597083d37a onnode - Fix long standing bug in onnode healthy/ok/connected/con.
When the output of "ctdb status -Y" changed to add an extra status
column we didn't fix onnode.

This adds a match for the extra column.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 793febaebd3d484ddfbbcb47aaa0cdf3cfc1a00d)
2011-06-08 14:21:26 +10:00
Mathieu Parent
c262fe6a8f Fix bashism
... again ;-)

Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit 2266586c1839af032622be54dc7f71e39d2bd9ef)
2011-05-14 22:30:25 +02:00
Ronnie Sahlberg
588905c5af Merge branch 'master' of ssh://git.samba.org/data/git/ctdb
(This used to be ctdb commit 307e915459c26a728a1ec16bd735d983d493df53)
2011-05-12 18:58:07 +10:00
Ronnie Sahlberg
d020b2c950 When using multiple VLANs, some funky stuff can sometimes happen when
adding/removing IP addresses, causing routes to be dropped by the system.

The easiest workaround for this is to unconditionally try to reapply
all static routes for all interfaces once ipreallocation has finished,
not just adding them back on the affected interface.

This works around a funky issue in
CQ S1023538

(This used to be ctdb commit 84600d1f53632d5fe76c308727f31f61b5ec1010)
2011-05-12 12:06:45 +10:00
Michael Adam
4ff1654f29 doc: regenerate ctdb docs
(This used to be ctdb commit 2d67186e5acd5aa8cb3eb1f4fbd4a41153c52e96)
2011-05-12 01:04:54 +02:00
Luk Claes
622fdc1acc doc/ctdb.1.xml: update listvars documentation
Signed-off-by: Luk Claes <luk@debian.org>
Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit afd96d5990815019b1f9ddc8b78a05f86eca0421)
2011-05-12 01:04:54 +02:00
Luk Claes
9a91020f97 doc: regenerate ctdb docs
Signed-off-by: Luk Claes <luk@debian.org>
Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit 39f595cad5321c64e2b1e72fe7b4bbb720f4b906)
2011-05-12 01:04:54 +02:00