mirror of https://github.com/samba-team/samba.git synced 2025-01-28 17:47:29 +03:00

3602 Commits

Author SHA1 Message Date
Martin Schwenke
adf8dbe8c0 IP allocation simulation - add examples.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 8a1ae0c5a3aa788ed0f29c264249ba7bc5d226a7)
2011-07-29 14:32:07 +10:00
Martin Schwenke
85c0b10a38 IP allocation simulation - tighten up termination condition for -x.
When there are IP groups, do not terminate when the overall cluster
goes out of balance.

Also make explicit that grat_ip_moves is an integer not a boolean, so
only terminate if it is greater than 0.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 0899f14b1483682d73d1ee2d2419db54ffeadc4b)
2011-07-29 14:32:07 +10:00
Martin Schwenke
d2ec92ba71 IP allocation simulation - fix documentation for diff() function.
It had out-of-date information and a typo.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 5d0d2b8b528414c859da0e6fd5959321db33608b)
2011-07-29 14:32:07 +10:00
Martin Schwenke
c84310e512 IP allocation simulation - add mean imbalance statistics.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit b541194d6075e5db72fb691fb79dc81659771cb1)
2011-07-29 14:32:07 +10:00
Martin Schwenke
9f8a781ff1 IP allocation simulation - add -A/--aggressive option.
This is likely to cause many more state changes for nodes.  In this
mode the odds of a failover are applied to determine whether a state
change occurs for each node.  If no state change occurs then the
process is repeated.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit b7c42bff9457ec8294b04245af8e3b6010707d1a)
2011-07-29 14:32:07 +10:00
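The aggressive mode described in the commit above can be sketched roughly as follows. This is a minimal Python illustration, not the simulation's actual code: the function name `aggressive_step`, the dict-of-booleans node state and the `odds` parameter are all assumptions made for the example.

```python
import random


def aggressive_step(healthy, odds, rng=random):
    """Return a new node-state map in which at least one node changed.

    healthy: dict mapping node number -> True (healthy) / False (unhealthy).
    odds:    probability (0 < odds <= 1) that any one node changes state.
    Both names are hypothetical; they only illustrate the idea.
    """
    if not healthy or odds <= 0:
        raise ValueError("need at least one node and odds > 0")
    while True:
        # Apply the odds to every node independently.
        new_state = {
            node: (not ok) if rng.random() < odds else ok
            for node, ok in healthy.items()
        }
        # If no state change occurred, repeat the whole process.
        if new_state != healthy:
            return new_state
```

Because the loop only returns once something has changed, every simulated event alters at least one node, which is why this mode is likely to cause many more state changes than a single pass would.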
Martin Schwenke
0829e1a22b IP allocation simulation - add LCP2 imbalance metric to node state output.
Print the LCP imbalance metric after the list of IPs.

To make this more sensible, put most of the printing logic into the
Node class.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 2e680e6b421d72cf2d217d3c3c1564da0bb19633)
2011-07-29 14:32:07 +10:00
Martin Schwenke
04196c78c7 IP allocation simulation - add analysis of IP groups.
The public addresses passed to the node constructor can be nested 2
levels.  Each sub-list is an IP group for which separate balance
analysis is done.  However, the public address list is flattened and
the actual IP assignment algorithm doesn't know about IP groups.

This allows extra statistics to be printed and an extra termination
condition to be added for unbalanced IP groups.

Most code from calculate_imbalance() is factored out to a new
function imbalance_for_ips(), which calculates imbalance for the given
IPs.  calculate_imbalance() now returns the overall imbalance and a
list containing imbalances for each IP group.  To support this
node_ip_coverage() now takes an optional list of IPs to check coverage
within.

This also adds extra output to show statistics for the LCP2 imbalance
metric.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 701395087156b2a5c7be1564897b796df35b69ec)
2011-07-29 14:32:07 +10:00
Martin Schwenke
43e9a7b6c4 IP allocation simulation - add -H/-S options for hard/soft imbalance limit.
An imbalance exceeding the hard limit, as specified by -H (and
defaulting to 1), now causes termination when -x is specified.

Imbalances exceeding the soft limit, as specified by -S (and
defaulting to 1), are counted and printed in the statistics summary.

A side-effect is that imbalances less than 2 are no longer rounded
down to 0, since we want to see them in the stats.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit b5e9a4c50eedb8cc786c52af06352788ca25f51e)
2011-07-29 14:32:07 +10:00
Martin Schwenke
2acf892e6e IP allocation simulation - add LCP2 algorithm.
Add -L/--lcp2 option and implement LCP2 algorithm as an alternative to
the basic non-deterministic algorithm.

Existing examples will break if used with LCP2 since it needs real IP
addresses.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 22b14e1a887f0479cc75ed9027af5cc24797f217)
2011-07-29 14:32:07 +10:00
Martin Schwenke
c3bdf4a0a1 IP allocation simulation - options.exit is boolean, so don't compare with 0.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 68a49739763b7125382186504b9cb9b770cfde0e)
2011-07-29 14:32:07 +10:00
Martin Schwenke
cec93a2ffe IP allocation simulation - remove unused function find_least_loaded_node().
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 4ff3b73b1ccb795fac98b26e038f41f5e32f0d6b)
2011-07-29 14:32:07 +10:00
Martin Schwenke
fb385a2043 IP reallocation simulation - remove --hack option.
The hacks were attempts at improving the deterministic IPs algorithm
but they didn't work.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 6034de0e24438e012f9f1d2065531b1ce467ac52)
2011-07-29 14:32:07 +10:00
Martin Schwenke
540c2cbcfd IP allocation simulation - add debug output using -vv.
-v can now be provided more than once to increase verbosity.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit ce4fb56c9972a854bd139429b6f4a26e8d5c3956)
2011-07-29 14:32:07 +10:00
Ronnie Sahlberg
a17ae8a8be Merge branch 'master' of 10.1.1.27:/shared/ctdb/ctdb-master
(This used to be ctdb commit 518945e59e2e48f07fcc0955f3aa81cd0d946aea)
2011-07-29 09:04:01 +10:00
Martin Schwenke
5ac67504ca Tests: Initial test code for LCP2 IP allocation algorithm.
Move struct ctdb_public_ip_list to ctdb_private.h and put some
definitions for some functions from ctdb_takeover.c there.  This
allows those functions to be called from unit tests.

Add ctdb_takeover_tests.c and the Makefile support to build it.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 9d34be0233edf3bc022345c0494c4b2a4d7f8480)
2011-07-29 09:01:36 +10:00
Martin Schwenke
ff1a81c872 IP allocation - add LCP2 algorithm.
The current non-deterministic IP allocation algorithm balances IPs
across the whole cluster.  It does not consider different
interfaces/VLANs/subnets, so these different groups of IPs aren't
generally well balanced.

This adds the LCP2 algorithm for IP allocation and allows it to be
enabled by setting the "LCP2PublicIPs" tunable to 1.

The LCP2 algorithm calculates the imbalance of a node by totalling the
squares of the distances between each IP on the node.  The IP distance
is defined as the length of the longest common prefix (LCP) of bits that
is found when comparing 2 IPs.  The imbalance of a cluster is the maximum
imbalance for any node.  At each step the algorithm selects the IP/node
combination whose allocation best reduces the imbalance of the cluster.

The implementation splits out the IP allocation part of
ctdb_takeover_run() into new function ctdb_takeover_run_core(), and
then extracts out the basic IP assignment code into new functions
basic_allocate_unassigned() and basic_failback().  3 new functions
lcp2_init(), lcp2_allocate_unassigned() and lcp2_failback() implement
the LCP2 algorithm, and are hooked into ctdb_takeover_run_core().

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 61fc7fbd0235469df22deb6581c6bd47e30bc0be)
2011-07-29 09:01:17 +10:00
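The imbalance metric described in the commit above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code in ctdb_takeover.c: the function names are hypothetical, and counting each unordered pair of IPs once (rather than once per direction) is an assumption of this sketch.

```python
import ipaddress
from itertools import combinations


def ip_distance(a, b):
    """Length in bits of the longest common prefix of two addresses."""
    x = ipaddress.ip_address(a)
    y = ipaddress.ip_address(b)
    if x.version != y.version:
        return 0  # assumption: IPv4 and IPv6 share no prefix
    # The highest differing bit marks the end of the common prefix.
    differing = int(x) ^ int(y)
    return x.max_prefixlen - differing.bit_length()


def node_imbalance(ips):
    """Sum of squared distances over every pair of IPs on one node."""
    return sum(ip_distance(a, b) ** 2 for a, b in combinations(ips, 2))


def cluster_imbalance(allocation):
    """Maximum node imbalance; allocation maps node -> list of IPs."""
    return max((node_imbalance(ips) for ips in allocation.values()), default=0)
```

IPs from the same subnet share a long prefix, so placing them on the same node is penalised heavily by the squared term. Minimising the metric therefore tends to spread each subnet's addresses across nodes.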
Ronnie Sahlberg
a5cd8a3270 Merge branch 'master' of 10.1.1.27:/shared/ctdb/ctdb-master
(This used to be ctdb commit 0e60a738f9a6275ed45abc3d933f872d93132d92)
2011-07-29 08:53:43 +10:00
Ronnie Sahlberg
e707f23596 Update the delip command
Don't talloc_free(vnn) immediately but postpone it until later when
the eventscript callback has completed.

CQ S1026664

(This used to be ctdb commit 0a99e8742a261b1d3a2c8830f5c19ea6c2c47cad)
2011-07-29 08:50:48 +10:00
Rusty Russell
87ea4818bf eventscript: fix callback after free
ctdb_event_script_callback() takes a mem_ctx arg which it doesn't use, but
the implication is pretty clear, that when that mem_ctx is freed, the callback
shouldn't happen.  Indeed, Ronnie reproduced a case where that callback
refers to freed memory, in the ip reallocation code under stress.

So attach the callback to the mem_ctx they give us, and remove it from the
script state structure when that's freed.  It's a bit weird, but it works.

CQ: S1026179
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 6fcd867cc835ef1ffc1c50964f135c346503d40c)
2011-07-29 08:50:39 +10:00
Michael Adam
fe114ca5d5 packaging: honour rpm build target options handed in to makerpms.sh
This allows calling e.g. "makerpms.sh -bs" to build only the source RPM.

(This used to be ctdb commit c6bfba2bb66962b7b05d708f0747002700991472)
2011-07-22 15:48:27 +02:00
Ronnie Sahlberg
69024f40bf Merge branch 'master' of ssh://git.samba.org/data/git/ctdb
(This used to be ctdb commit a1b3661973489f0111e7975fec422fb99a25f0c8)
2011-07-20 15:53:11 +10:00
Ronnie Sahlberg
0a86d6ed91 Add a text about "ban" "unban" not being permanent and that recovery daemon can auto unban nodes. Suggest using "stop" / "continue" instead.
(This used to be ctdb commit 8e30dffad5b1385818b2d7350d6c3767a220d745)
2011-07-09 07:14:32 +10:00
Michael Adam
926a6d7f05 web: correctly terminate list items <li> with </li> instead of with <br>
(This used to be ctdb commit 3f698e69a56305c5ec27b8d119bf2d57d5cd2ec6)
2011-07-08 10:07:42 +02:00
Michael Adam
577f58a8be web: add Stefan Metzmacher to the list of CTDB developers.
(This used to be ctdb commit 912a33cebe7c51b33cda2e6d5f2b3a481fa7fd49)
2011-07-08 10:06:14 +02:00
Ronnie Sahlberg
c93a968619 When trying to re-balance the ip assignment and shuffle ips from
nodes with many addresses to nodes with few addresses,
loop up to num_ips+5 times instead of only 5 times.

When we have very many public ips per node, we might need to loop more than
5 times or else we will exit without reaching optimal balance.

(This used to be ctdb commit aa8114a625a637277561a66c80bdece3c27e9e20)
2011-07-06 13:14:13 +10:00
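Why a fixed limit of 5 passes can stop short of balance is easiest to see in a small model. The Python sketch below is hypothetical and much simpler than the real failback code: it moves one IP per pass from the most loaded node to the least loaded one, and the function name and `max_passes` parameter are assumptions for illustration.

```python
def rebalance(allocation, max_passes=None):
    """Move one IP per pass from the fullest node to the emptiest node.

    allocation: dict mapping node -> list of IPs (modified in place).
    max_passes: pass limit; defaults to num_ips + 5 as in the commit.
    Returns the number of moves made.
    """
    num_ips = sum(len(ips) for ips in allocation.values())
    if max_passes is None:
        max_passes = num_ips + 5
    moves = 0
    for _ in range(max_passes):
        most = max(allocation, key=lambda n: len(allocation[n]))
        least = min(allocation, key=lambda n: len(allocation[n]))
        # Balanced enough: no node holds 2 or more IPs above another.
        if len(allocation[most]) - len(allocation[least]) < 2:
            break
        allocation[least].append(allocation[most].pop())
        moves += 1
    return moves
```

With 12 IPs on one node of a 3-node cluster, this model needs 8 moves to reach 4/4/4, so a cap of 5 passes leaves it unbalanced while a cap of num_ips+5 does not.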
Ronnie Sahlberg
20a7c19691 Add log output to wipedb and backupdb
CQ S1025379

(This used to be ctdb commit 6f51d4a75f8a9f2cdb8ecde946ed31809ab5a415)
2011-07-06 13:13:18 +10:00
Ronnie Sahlberg
18af72f08f change the name for the key for the record where we store the public address config from public-addresses... to public_addresses...
CQ1019030

(This used to be ctdb commit 114d5034ff4880848588caf493382a537a1469ae)
2011-06-28 15:40:46 +10:00
David Disseldorp
58c7f5bf00 client: handle transient connection errors
Client connections to the ctdbd unix domain socket may fail
intermittently while the server is under heavy load. This change
introduces a client connect retry loop.

During failure the client will retry for a maximum of 64 seconds, the
ctdb --timelimit option can be used to cap client runtime.

Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit dc0c58547cd4b20a8e2cd21f3c8363f34fd03e75)
2011-06-23 15:56:17 +02:00
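A connect retry loop of the kind described in the commit above might look like the following Python sketch. It is not the ctdb client code (which is C): the function names, the 1-second retry interval and the injectable `sleep`/`clock` parameters are assumptions for illustration. Only the 64-second ceiling comes from the commit message.

```python
import socket
import time


def connect_unix(path):
    """Open a stream connection to a unix domain socket at `path`."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
    except OSError:
        sock.close()
        raise
    return sock


def connect_with_retry(connect, max_wait=64.0, interval=1.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Call `connect()` until it succeeds or `max_wait` seconds pass.

    The last connection error is re-raised once the deadline is reached.
    """
    deadline = clock() + max_wait
    while True:
        try:
            return connect()
        except OSError:
            # Treat the failure as transient until the deadline expires.
            if clock() >= deadline:
                raise
            sleep(interval)
```

A caller would wrap the real connect step, for example `connect_with_retry(lambda: connect_unix(path))`, where `path` is whatever socket path the daemon is configured to use.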
Mathieu Parent
4a43450968 Manpage for ping_pong
Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit af75d3e37412e03d3978073edbe6dee78f265c3c)
2011-06-23 15:56:17 +02:00
Martin Schwenke
5ddc10128a onnode: fix natgwlist nodespec
This hasn't worked for a while if ever.

We treat this case specially because the output has 2 words on the 1st
line.  We also handle the error case where /etc/ctdb_natgw_nodes
exists but none of the other $NATGW_* configuration is done.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 66e89797c7866d207a5bbf1836f52d70dba7cea6)
2011-06-08 14:24:00 +10:00
Martin Schwenke
1ef399e48d onnode: fix get_nodes_with_status()
Setting IFS and looping through items with colons in them doesn't work.
Change this to read through the output line by line.  The header line
needs to be thrown away by throwing away everything up to the 1st
newline.

Keep stderr from the "ctdb status" command, otherwise debugging is
impossible.

On error, append any output from ctdb to onnode's error message.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d60592cf99999f10344a05ef0571fb300bb9d97c)
2011-06-08 14:23:40 +10:00
Martin Schwenke
41436193dd onnode: Remove an unnecessary comment.
The comment about $CTDB_NODES_SOCKETS is meaningless.  The code it
refers to works just fine with $CTDB_NODES_SOCKETS.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 74e69a564bac653dadfffe8b08145b9b3be16e61)
2011-06-08 14:23:14 +10:00
Martin Schwenke
f730194f12 onnode: Future-proof get_nodes_with_status().
The current code requires knowledge of the number of status bits
output by "ctdb status -Y".

This changes the code to be completely general.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit e1788f25fde3d1f26bf4831a331741aa280f6fbc)
2011-06-08 14:22:49 +10:00
Martin Schwenke
f3ea7bec68 onnode: Exit with error for unknown command-line flags.
Use of "local" was masking errors in command-line processing.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit ca80adda7517b43147ef30156ae34c66b29fa2bd)
2011-06-08 14:22:16 +10:00
Martin Schwenke
350f3e5b09 onnode: Be defensive when listing IPs of nodes with designated status.
The current version gives the last item left after stripping the known
fields.  If an insufficient number of status fields is stripped then
this would return a residual status field value, which turned out to
be a valid IP address for localhost...  so no error occurs.

This change means that the node number is stripped and any residual
status field value will stay appended, causing an error the first time
this command is tested.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 74715e6ec7b67c6f0e863aa51c87279758d6bf91)
2011-06-08 14:21:53 +10:00
Martin Schwenke
597083d37a onnode - Fix long standing bug in onnode healthy/ok/connected/con.
When the output of "ctdb status -Y" changed to add an extra status
column we didn't fix onnode.

This adds a match for the extra column.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 793febaebd3d484ddfbbcb47aaa0cdf3cfc1a00d)
2011-06-08 14:21:26 +10:00
Mathieu Parent
c262fe6a8f Fix bashism
... again ;-)

Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit 2266586c1839af032622be54dc7f71e39d2bd9ef)
2011-05-14 22:30:25 +02:00
Ronnie Sahlberg
588905c5af Merge branch 'master' of ssh://git.samba.org/data/git/ctdb
(This used to be ctdb commit 307e915459c26a728a1ec16bd735d983d493df53)
2011-05-12 18:58:07 +10:00
Ronnie Sahlberg
d020b2c950 When using multiple VLANs, some funky stuff can sometimes happen when
adding/removing IP addresses, causing routes to be dropped by the system.

The easiest workaround for this is to unconditionally try to reapply
all static routes for all interfaces once ipreallocation has finished,
not just adding them back on the affected interface.

This works around a funky issue in
CQ S1023538

(This used to be ctdb commit 84600d1f53632d5fe76c308727f31f61b5ec1010)
2011-05-12 12:06:45 +10:00
Michael Adam
4ff1654f29 doc: regenerate ctdb docs
(This used to be ctdb commit 2d67186e5acd5aa8cb3eb1f4fbd4a41153c52e96)
2011-05-12 01:04:54 +02:00
Luk Claes
622fdc1acc doc/ctdb.1.xml: update listvars documentation
Signed-off-by: Luk Claes <luk@debian.org>
Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit afd96d5990815019b1f9ddc8b78a05f86eca0421)
2011-05-12 01:04:54 +02:00
Luk Claes
9a91020f97 doc: regenerate ctdb docs
Signed-off-by: Luk Claes <luk@debian.org>
Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit 39f595cad5321c64e2b1e72fe7b4bbb720f4b906)
2011-05-12 01:04:54 +02:00
Luk Claes
15f92f814d doc/ctdb.1.xml: Fix typo
s/poerwoff/poweroff/

Bug 8124

Signed-off-by: Luk Claes <luk@debian.org>
Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit a6d2f1bd552dba33640acb7a0b8110534debd4ce)
2011-05-12 01:04:54 +02:00
Ronnie Sahlberg
5b93e0a870 Remove all checking of GPFS from ctdb_diagnostics
CQ S1023524

(This used to be ctdb commit 4cddba08b46db0a56a86b32403a41b89cd097317)
2011-05-11 21:25:25 +10:00
Ronnie Sahlberg
d1edf44e4f If samba fails to start for some reason, make this cause the startup event to fail too, so that ctdbd will re-try the startup event later.
Or else this will leave samba not running.

CQ S1023394

(This used to be ctdb commit f90485b08d32cbe56050718a3b28ca0fe1d64e0f)
2011-05-10 09:59:38 +10:00
Ronnie Sahlberg
ee9e137759 Don't exit from checking interfaces once we have found one interface that is not
in use by public addresses.  This can happen when we have removed existing interfaces/ip addresses and prevents us from verifying the status of other interfaces.

(This used to be ctdb commit d67955b42f7627be9dae995230c8fcbb8a948ec2)
2011-05-10 07:53:43 +10:00
Ronnie Sahlberg
2e2e37fdd6 Remove logging of spam/errors from the 10.interface
script if/when we have for example NATGW configured but no public addresses defined on that interface

CQ S1023378

(This used to be ctdb commit 8837daa424732aeb5a20814b1709c345a97a0e09)
2011-05-09 08:10:49 +10:00
Michael Adam
0b80a7618d packaging: add ltdbtool and its manpage to the RPM
(This used to be ctdb commit ce6409dc7d059701f0fe4b57e7c05c38c66629c5)
2011-05-04 14:40:13 +02:00
Michael Adam
3c82a043b1 install the ltdbtool manpage with "make install"
(This used to be ctdb commit ffbff1affed8301831387e23b4f8f824d9f78e20)
2011-05-04 14:40:12 +02:00
Michael Adam
f3066d7fb4 install ltdbtool with "make install"
(This used to be ctdb commit 991ea66e5ed0eb7ab256dc8e3118dc78462d4752)
2011-05-04 14:40:12 +02:00