mirror of https://github.com/samba-team/samba.git synced 2024-12-23 17:34:34 +03:00
Commit Graph

430 Commits

Author SHA1 Message Date
Martin Schwenke
1ae731198a recoverd: Move struct ctdb_public_ip_list back into ctdb_takeover.c
This is an internal structure.  It was moved into ctdb_private.h a
long time ago to allow unit testing.  Unit test compilation was
changed shortly afterwards to make this unnecessary.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit db57261d7dc264e161659a8c547f44fbd9e88eeb)
2013-08-22 17:00:20 +10:00
Martin Schwenke
a5cb72cac3 ctdbd: Kill client process without checking for tracked child
Commit f73a4b1495830bcdd094a93732a89dd53b3c2f78 added a safety check
to ensure that CTDB never kills unrelated processes.  However, client
processes are unrelated.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 782814288bb560099ee44b607bf35f3eddf37f82)
2013-07-29 15:58:51 +10:00
Martin Schwenke
f46ab595d1 recoverd: Call takeover fail callback only once per node
Currently the fail callback is called once per (takeip/releaseip) control
failure.  This is overkill and can get a node banned much too quickly.

Instead, keep track of control failures per node and only call fail
callback once per failed node.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit bf4a7c1ad87e0e848296d15d63eb8cd901ca5335)
2013-07-29 15:48:48 +10:00
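The idea in commit f46ab595d1 (track failures per node, then invoke the fail callback once per failed node) might be sketched as follows. This is a minimal illustration, not the CTDB code: every `example_` name, the fixed `EXAMPLE_MAX_NODES` limit, and the callback signature are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EXAMPLE_MAX_NODES 16 /* hypothetical limit, for the sketch only */

/* Hypothetical per-run state: did a node fail at least one control? */
struct example_fail_state {
    bool node_failed[EXAMPLE_MAX_NODES];
};

/* Hypothetical callback type; the real fail callback signature may differ. */
typedef void (*example_fail_fn)(unsigned int pnn, void *private_data);

/* Record a failed takeip/releaseip control. Repeated failures on the
 * same node collapse into a single flag. */
void example_note_failure(struct example_fail_state *st, unsigned int pnn)
{
    if (pnn < EXAMPLE_MAX_NODES) {
        st->node_failed[pnn] = true;
    }
}

/* Call fn exactly once per failed node; return the number of failed nodes. */
unsigned int example_report_failures(const struct example_fail_state *st,
                                     example_fail_fn fn, void *private_data)
{
    unsigned int pnn, count = 0;

    for (pnn = 0; pnn < EXAMPLE_MAX_NODES; pnn++) {
        if (st->node_failed[pnn]) {
            fn(pnn, private_data);
            count++;
        }
    }
    return count;
}

/* Example callback: counts how many times it was invoked. */
void example_count_cb(unsigned int pnn, void *private_data)
{
    (void)pnn;
    (*(unsigned int *)private_data)++;
}
```

With three recorded failures on one node and one on another, `example_report_failures()` invokes the callback twice, so a node accumulates at most one "strike" per takeover run in this sketch.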
Amitay Isaacs
1c21f37e57 ctdbd: Set process names for child processes
This helps distinguish processes in process list in top, perf, etc.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 2493f57ce268d6fe7e4c40a87852c347fd60d29e)
2013-07-10 14:33:19 +10:00
Amitay Isaacs
bcb64aa55f recoverd: Fix buffer overflow error in reloadips
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-Programmed-With: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 41182623891d74a7e9e9c453183411a161201e67)
2013-07-05 15:52:34 +10:00
Martin Schwenke
dcdae86dc7 ctdbd: Log something when releasing all IPs
At the moment this is silent and it can be confusing to see IPs just
disappear.

Also, this message:

  Been in recovery mode for too long. Dropping all IPS

can cause anxiety when all IPs should already have been dropped.
Adding a comforting message saying that 0 IPs were dropped relieves
such anxiety.  :-)

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 4d0f26b306fc465d551d340b0e7dce4412eae3fd)
2013-07-05 15:52:33 +10:00
Martin Schwenke
7290798a41 recoverd: Clean up log messages in remote IP verification
The log messages in verify_remote_ip_allocation() are confusing
because they don't include the PNN of the problem node, because it is
not known in this function.

Add the PNN of the node being verified as a function argument and then
shuffle the log messages around to make them clearer.

Also fold 3 nested if statements into just one.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit f0942fa01cd422133fc9398f56b4855397d7bc86)
2013-07-05 15:52:33 +10:00
Martin Schwenke
26b161156a ctdbd: Release IP callback should fail if the IP is still hosted
At the moment there are (at least) 2 bugs that cause rogue IPs:

* A race where release_ip_callback() runs after a "subsequent" take IP
  has completed.  The IP is back on an interface but we unset
  vnn->iface in the callback.

* A "releaseip" eventscript times out.  We ignore the timeout and call
  it success, deleting the VNN even if the IP is still hosted.

  We could decide not to ignore the timeout and ban the node, but
  killing TCP connections can take a long time and that might result
  in a lot of banning.  We probably won't reinstate banning on
  "releaseip" until killing TCP connections has been optimised.

In both cases, a rogue IP can be avoided by leaving vnn->iface set and
simply failing the control.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit c5797f2942e83da24df548ea07196fbbac0eab20)
2013-07-05 15:52:32 +10:00
Martin Schwenke
793233f6b6 ctdbd: Log warnings in release IP when unexpected interface is encountered
Previous code changes work around potential problems but do not
provide useful information when a problem occurs.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit f1f1b0c24b9b6cd24b83a4e4da16e179287ec6ac)
2013-07-05 15:52:32 +10:00
Amitay Isaacs
6391f61fbc build: Fix compiler warnings for uninitialized variables
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 5408c5c4050539e5aa06a5e82ceb63a6cb5cef0c)
2013-07-04 20:43:52 +10:00
Mathieu Parent
d82b9ae410 build: Fix tdb.h path to enable building with system TDB library
(This used to be ctdb commit f8bf99de3a5f56be67aaa67ed836458b1cf73e86)
2013-06-14 16:45:27 +10:00
Martin Schwenke
1ab2bbb349 recoverd: Backward compatibility for nodes without IPREALLOCATED control
Consider the case of upgrading a cluster node by node, where some
nodes are still running older versions of CTDB without the
IPREALLOCATED control.  If a "new" node takes over as recovery master
and a failover occurs, then it will attempt to send IPREALLOCATED
controls to all nodes.  The "old" nodes will fail in a fairly
nondescript way (result == -1).

To try to handle this situation, fall back to the EVENTSCRIPT control
to handle "ipreallocated".  Only do this on the failed nodes.
However, do not do this on nodes that timed out (they've probably
implemented the control and we should call the regular fail_callback
to get those nodes banned) or for stopped nodes (since they can't
actually run the "ipreallocated" event via the EVENTSCRIPT control).

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit b2654853ce9b7c18c5874b080bc94d3118078a5d)
2013-05-27 15:15:25 +10:00
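The per-node decision described in commit 1ab2bbb349 can be read as a small decision function. The sketch below is an assumption-laden illustration: the enum, the function name and its three inputs are hypothetical, and what happens to a stopped node beyond "no fallback" is not stated in the commit message, so the sketch only records that the fallback is skipped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes for one node after the IPREALLOCATED control. */
enum example_action {
    EXAMPLE_ACTION_NONE,          /* control succeeded, nothing more to do */
    EXAMPLE_ACTION_EVENTSCRIPT,   /* failed: retry via the EVENTSCRIPT control */
    EXAMPLE_ACTION_FAIL_CALLBACK, /* timed out: use the regular fail callback */
    EXAMPLE_ACTION_NO_FALLBACK    /* stopped node: cannot run the event that way */
};

/* Decide what to do for a node, given the control result (0 on success,
 * non-zero such as -1 on failure), whether the control timed out, and
 * whether the node is stopped. */
enum example_action example_ipreallocated_action(int result, bool timed_out,
                                                 bool node_stopped)
{
    if (timed_out) {
        /* The node probably implements the control, so treat it as a
         * genuine failure. */
        return EXAMPLE_ACTION_FAIL_CALLBACK;
    }
    if (result == 0) {
        return EXAMPLE_ACTION_NONE;
    }
    if (node_stopped) {
        return EXAMPLE_ACTION_NO_FALLBACK;
    }
    /* Plain failure: likely an older node without the control. */
    return EXAMPLE_ACTION_EVENTSCRIPT;
}
```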
Martin Schwenke
f35e9bba9b recoverd: Nodes can only takeover IPs if they are in runstate RUNNING
Currently the order of the first IP allocation, including the first
"ipreallocated" event, and the "startup" event is undefined.  Both of
these events can (re)start services.

This stops IPs being hosted before the "startup" event has completed.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit f15dd562fd8c08cafd957ce9509102db7eb49668)
2013-05-24 16:27:55 +10:00
Martin Schwenke
7f03618ae4 recoverd: Handle errors carefully when fetching tunables
If a tunable is not implemented on a remote node then this should not
be fatal.  In this case the takeover run can continue using benign
defaults for the tunables.

However, timeouts and any unexpected errors should be fatal.  These
should abort the takeover run because they can lead to unexpected IP
movements.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c0c27762ea728ed86405b29c642ba9e43200f4ae)
2013-05-24 16:27:55 +10:00
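The error handling described in commit 7f03618ae4 could look roughly like this. It is a sketch under assumptions: the status enum and `example_resolve_tunable()` are hypothetical names, and how the real code detects "not implemented" is not shown in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification of a tunable fetch from a remote node. */
enum example_fetch_status {
    EXAMPLE_FETCH_OK,
    EXAMPLE_FETCH_NOT_IMPLEMENTED, /* remote node does not know the tunable */
    EXAMPLE_FETCH_TIMEOUT,
    EXAMPLE_FETCH_ERROR            /* any other unexpected error */
};

/* Returns 0 and stores the value to use in *value, or returns -1 to
 * signal that the takeover run should be aborted. */
int example_resolve_tunable(enum example_fetch_status status,
                            uint32_t fetched, uint32_t default_value,
                            uint32_t *value)
{
    switch (status) {
    case EXAMPLE_FETCH_OK:
        *value = fetched;
        return 0;
    case EXAMPLE_FETCH_NOT_IMPLEMENTED:
        /* Not fatal: continue with an explicit, benign default. */
        *value = default_value;
        return 0;
    case EXAMPLE_FETCH_TIMEOUT:
    case EXAMPLE_FETCH_ERROR:
        break;
    }
    /* Fatal: continuing could cause unexpected IP movements. */
    return -1;
}
```

Passing the default explicitly, rather than relying on a zero-initialised buffer, mirrors the intent of the neighbouring commit 116f62a7b3.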
Martin Schwenke
116f62a7b3 recoverd: Set explicit default value when getting tunable from nodes
Both of the current defaults are implicitly 0.  It is better to make
the defaults obvious.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 1190bb0d9c14dc5889c2df56f6c8986db23d81a1)
2013-05-24 16:04:57 +10:00
Martin Schwenke
e78b064dcc recoverd: Whitespace improvements
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 473cfcb019f0cb4a094bf10397f7414f7923ee57)
2013-05-24 15:55:11 +10:00
Martin Schwenke
1a181a4284 recoverd: Use talloc_array_length() for simpler code
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit f6792f478197774d2f3b2258c969b67c83e017ab)
2013-05-24 15:55:10 +10:00
Martin Schwenke
63577c96db ctdbd: Replace ctdb->done_startup with ctdb->runstate
This allows states, including startup and shutdown states, to be
clearly tracked.  This doesn't include regular runtime "states", which
are handled by node flags.

Introduce new functions ctdb_set_runstate(), runstate_to_string() and
runstate_from_string().

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 8076773a9924dcf8aff16f7d96b2b9ac383ecc28)
2013-05-24 14:08:06 +10:00
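A table-driven pair of conversion helpers, in the spirit of the runstate_to_string() and runstate_from_string() functions named in commit 63577c96db, might be sketched as below. The set of states and every `example_` name are assumptions for illustration; the commit message only confirms that startup and shutdown states exist (and a later commit mentions RUNNING).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative subset of run states; the real enum may differ. */
enum example_runstate {
    EXAMPLE_RUNSTATE_UNKNOWN,
    EXAMPLE_RUNSTATE_STARTUP,
    EXAMPLE_RUNSTATE_RUNNING,
    EXAMPLE_RUNSTATE_SHUTDOWN
};

/* One table drives both directions of the conversion. */
static const struct {
    enum example_runstate runstate;
    const char *label;
} example_runstate_map[] = {
    { EXAMPLE_RUNSTATE_UNKNOWN,  "UNKNOWN"  },
    { EXAMPLE_RUNSTATE_STARTUP,  "STARTUP"  },
    { EXAMPLE_RUNSTATE_RUNNING,  "RUNNING"  },
    { EXAMPLE_RUNSTATE_SHUTDOWN, "SHUTDOWN" },
};

#define EXAMPLE_MAP_LEN \
    (sizeof(example_runstate_map) / sizeof(example_runstate_map[0]))

const char *example_runstate_to_string(enum example_runstate runstate)
{
    size_t i;

    for (i = 0; i < EXAMPLE_MAP_LEN; i++) {
        if (example_runstate_map[i].runstate == runstate) {
            return example_runstate_map[i].label;
        }
    }
    return "UNKNOWN";
}

/* Unrecognised labels map to EXAMPLE_RUNSTATE_UNKNOWN. */
enum example_runstate example_runstate_from_string(const char *label)
{
    size_t i;

    for (i = 0; i < EXAMPLE_MAP_LEN; i++) {
        if (strcmp(example_runstate_map[i].label, label) == 0) {
            return example_runstate_map[i].runstate;
        }
    }
    return EXAMPLE_RUNSTATE_UNKNOWN;
}
```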
Martin Schwenke
5fdf71b898 recoverd: takeover_run_core() should not use modified node flags
Modifying the node flags with IP-allocation-only flags is not
necessary.  It causes breakage if the flags are not cleared after use.
ctdb_takeover_run() no longer needs the general node flags - it only
needs the IP flags.

Instead of modifying the node flags in nodemap, construct a custom IP
flags list and have takeover_run_core() use that instead of node
flags.  As well as being safer, this makes the IP allocation code more
self contained and a little bit clearer.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 14bd0b6961ef1294e9cba74ce875386b7dfbf446)
2013-05-23 16:18:23 +10:00
Martin Schwenke
e769f8575a ctdbd: Log add and delete of IPs
At the moment, when someone deletes all the IPs on a node, all we see
are the release IP messages and we have to guess why.

Some would argue that add/release are more significant than
take/release so they should be logged.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 3c3df1d6afec7e3e721f9bcd4e8b8e008fd6e50b)
2013-05-22 14:24:22 +10:00
Martin Schwenke
0baefba368 ctdbd: Removed bogus comment in ctdb_find_iface()
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 4a8d90d0812a3242f58a2a0e2aa0f528f60f7013)
2013-05-22 14:24:21 +10:00
Martin Schwenke
54e91df60d recoverd: Move IP flags into ctdb_takeover.c
These should never be seen outside the IP allocation code.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit e143abd16ccde2e0edfe103673d31a5fb06b6aef)
2013-05-09 12:55:42 +10:00
Martin Schwenke
50f19b5bd4 recoverd: Clear IP flags after IP allocation algorithm has run
If these flags are left set they will confuse other recovery daemon
code.

Factor the clearing code into new function clear_ipflags().

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 45c776958017ea7001f061842c9e0f60e4a25f23)
2013-05-09 12:55:42 +10:00
Martin Schwenke
530020d83b recoverd: Remove unused mask argument and initial mask calculation
This has been replaced by set_ipflags() and associated functionality.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d0a3822573db296e73cc897835f783c8abc084b3)
2013-05-07 16:20:47 +10:00
Martin Schwenke
ee7357de51 recoverd: When calculating rebalance candidates don't consider flags
This is really a check to see if a node is already hosting IPs.  If
so, we assume it was previously healthy so it isn't considered as a
rebalance candidate.  There's no need to limit this to healthy nodes,
since this is checked elsewhere.

Due to this the variable newly_healthy is renamed everywhere to
rebalance_candidates.

The mask argument is now completely unused.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 65e0ea6c2c0629e19349ba4b9affa221fde2b070)
2013-05-07 16:20:47 +10:00
Martin Schwenke
c9056b4f88 recoverd: Remove unused mask argument from IP allocation functions
This is a no-op and is in a separate commit to make the previous
commit less cumbersome.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 107e656bbe24f9d21fbaf886a3e9417da4effe5a)
2013-05-07 16:20:47 +10:00
Martin Schwenke
0445c988e2 recoverd: Fix tunable NoIPTakeoverOnDisabled, rename to NoIPHostOnAllDisabled
This really needs to be per-node.  The rename is because nodes with
this tunable switched on should drop IPs if they become unhealthy (or
disabled in some other way).

* Add new flag NODE_FLAGS_NOIPHOST, only used in recovery daemon.

* Enhance set_ipflags_internal() and set_ipflags() to setup
  NODE_FLAGS_NOIPHOST depending on setting of NoIPHostOnAllDisabled
  and/or whether nodes are disabled/inactive.

* Replace can_node_serve_ip() with functions can_node_host_ip() and
  can_node_takeover_ip().  These functions are the only ones that need
  to look at NODE_FLAGS_NOIPTAKEOVER and NODE_FLAGS_NOIPHOST.  They
  can make the decision without looking at any other flags due to
  previous setup.

* Remove explicit flag checking in IP allocation functions (including
  unassign_unsuitable_ips()) and just call can_node_host_ip() and
  can_node_takeover_ip() as appropriate.

* Update test code to handle CTDB_SET_NoIPHostOnAllDisabled.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 1308a51f73f2e29ba4dbebb6111d9309a89732cc)
2013-05-07 16:20:46 +10:00
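One possible reading of the flag setup in commit 0445c988e2 is sketched below. It is an interpretation, not the CTDB implementation: the structs and all `example_` names are hypothetical, the exact rules are inferred from the commit messages, and the flags are kept in a separate per-node array (the approach a later commit, 5fdf71b898, describes) rather than in the node flags themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-node inputs: node state plus two per-node tunables. */
struct example_node {
    bool inactive;
    bool disabled;
    bool no_ip_takeover;             /* NoIPTakeover tunable */
    bool no_ip_host_on_all_disabled; /* NoIPHostOnAllDisabled tunable */
};

/* Hypothetical IP-allocation-only flags, one entry per node. */
struct example_ipflags {
    bool noiptakeover;
    bool noiphost;
};

/* True if no node is both active and enabled. */
bool example_all_nodes_are_disabled(const struct example_node *nodes, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (!nodes[i].inactive && !nodes[i].disabled) {
            return false;
        }
    }
    return true;
}

/* Assumed rules: inactive nodes never host IPs; a disabled node may host
 * IPs only when every node is disabled and its own tunable allows it. */
void example_set_ipflags(const struct example_node *nodes, size_t n,
                         struct example_ipflags *ipflags)
{
    bool all_disabled = example_all_nodes_are_disabled(nodes, n);
    size_t i;

    for (i = 0; i < n; i++) {
        ipflags[i].noiptakeover = nodes[i].no_ip_takeover;
        if (nodes[i].inactive) {
            ipflags[i].noiphost = true;
        } else if (nodes[i].disabled) {
            ipflags[i].noiphost = !all_disabled ||
                                  nodes[i].no_ip_host_on_all_disabled;
        } else {
            ipflags[i].noiphost = false;
        }
    }
}

/* After the setup above, the two predicates need no other flags. */
bool example_can_node_host_ip(const struct example_ipflags *f)
{
    return !f->noiphost;
}

bool example_can_node_takeover_ip(const struct example_ipflags *f)
{
    return !f->noiptakeover && example_can_node_host_ip(f);
}
```

Concentrating the policy in the setup step is what lets the allocation functions call only the two predicates, as the commit message describes.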
Martin Schwenke
ac80824709 recoverd: Factor out new function all_nodes_are_disabled()
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 12aef10e9889760d98f58c8d916f19d069fa381a)
2013-05-07 16:20:46 +10:00
Martin Schwenke
657162fb34 recoverd: Refactor code to get NoIPTakeover tunable from all nodes
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 1fb5352d2b6918fcc6f630db49275d25a3eebe8d)
2013-05-07 16:20:46 +10:00
Martin Schwenke
17521b31b2 recoverd: Add debug message when dropping IPs in IP allocation
Update tests accordingly.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 91405282ba4abad4ad8e8c5f7ee4c83c75f38280)
2013-05-07 16:20:46 +10:00
Martin Schwenke
745c6bc363 recoverd: ctdb_takeover_run() uses CTDB_CONTROL_IPREALLOCATED
This means "ipreallocated" is now run on stopped nodes.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 83b61f7414b1f7a3424497ac987ca0724fba9eaa)
2013-05-06 13:38:21 +10:00
Martin Schwenke
2e59cd5428 ctdbd: New control CTDB_CONTROL_IPREALLOCATED
This is an alternative to using ctdb_run_eventscripts() that can be
used when in recovery.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 27a44685f0d7a88804b61a1542bb42adc8f88cb1)
2013-05-06 13:38:21 +10:00
Amitay Isaacs
77a29b3733 recoverd/takeover: Use IP->node mapping info from nodes hosting that IP
When collating IP information for IP layout, only trust the nodes that are
hosting an IP, to have correct information about that IP.  Ignore what all the
other nodes think.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 1c7adbccc69ac276d2b957ad16c3802fdb8868ca)
2013-04-08 11:14:32 +10:00
Martin Schwenke
53bd183683 recoverd: Separate each IP allocation algorithm into its own function
This makes the code much more readable and maintainable.

As a side effect, fix a memory leak in LCP2.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 6a1d88a17321f7e1dc84b4823d5e7588516a6904)
2013-01-08 10:16:11 +11:00
Martin Schwenke
2e8df43561 recoverd: New function unassign_unsuitable_ips()
Move the code into a new function so it can be called from a number of
places.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 8adb255e62dbe60d1e983047acd7b9c941231d11)
2013-01-08 10:16:11 +11:00
Martin Schwenke
bcefb76884 recoverd: Move failback retry loop into basic_failback() and lcp2_failback()
The retry loop is currently in ctdb_takeover_run_core().  Pushing it
into each function will make it possible to put each algorithm into a
separate top-level function.  This will make the code much clearer and
more maintainable.

Also keep associated test code compatible.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit f6ce18d011dd9043b04256690d826deb2640cd89)
2013-01-08 10:16:11 +11:00
Martin Schwenke
443fbb9e01 recoverd: Trying to failback more IPs no longer allocates unassigned IPs
Neither basic_failback() nor lcp2_failback() unassign IPs anymore, so
there's no point looping back that far.

Also fix a unit test that now fails because looping back to handle
unassigned IPs is no longer logged.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c09aeaecad7d3232b1c07bab826b96818756f5e0)
2013-01-08 10:16:11 +11:00
Martin Schwenke
dfa7ce7b73 recoverd: basic_failback() can call find_takeover_node() directly
Instead of unassigning, looping back and depending on
basic_allocate_unassigned.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 4dc08e37dec464c8785a2ddae15c7c69d3c81ac3)
2013-01-08 10:16:11 +11:00
Martin Schwenke
326328d520 recoverd: Don't do failback at all when deterministic IPs are in use
This seems to be the right thing to do instead of calling into the
failback code and continually skipping the release of an IP.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 4c87e7cb3fa2cf2e034fa8454364e0a7fe0c8f81)
2013-01-08 10:16:11 +11:00
Martin Schwenke
ef403f70f2 recoverd: Move the test for both 'DeterministicIPs' and 'NoIPFailback' set
If this is done earlier then some other logic can be improved.  Also,
this should be a warning since no error condition is set.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit e06476e07197b7327b8bdac9c0b2e7281798ffec)
2013-01-08 10:16:11 +11:00
Martin Schwenke
a3911ed7bf recoverd: Fix a memory leak in IP allocation
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit bcd5f587aff3ba536cb0b5ef00d2d802352bae25)
2013-01-08 10:16:11 +11:00
Martin Schwenke
4f0d68cba6 ctdbd: Clean up orphaned interfaces when an IP is deleted
Add a new function ctdb_remove_orphaned_ifaces() and call it in
ctdb_control_del_public_address().

ctdb_remove_orphaned_ifaces() uses a naive implementation that does
things in a very obvious way.  There are many ways to improve the
performance - some are mentioned in a comment in the code.  However, I
doubt that this will be a bottleneck even with a large number of
public IPs.  Running the eventscript is likely to outweigh the cost of
this cleanup.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit cc1a3ae911d3fee8b87fda5de5ab6d9499d7510a)
2013-01-07 12:19:33 +11:00
Martin Schwenke
0f1bcebc80 ctdbd: Make the link status of new interfaces more flexible
Neither up nor down is a good default value for the link status of a
new interface.  Up means that IPs can be assigned to interfaces before
the true state is known and they can move away quickly if the interface
is actually down.  Down means that IPs can't be assigned to an interface
for a variable amount of time - until a monitor cycle occurs - and this
can result in imbalanced IPs.

This is a neat compromise.  Before the startup event completes, IPs
can't be assigned to interfaces because all interfaces begin in a down
state.  As soon as the startup event completes, IPs can be allocated
to any interface that has been marked up by the eventscript.  Later,
during normal operation, newly added IPs can be assigned to new
interfaces immediately.  The IPs will still move away if an interface
is noticed to be down in the next monitor cycle, but that is the
exception rather than the rule.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 9275a69a414482f1053ae14528d5972575b9214e)
2012-11-19 15:53:13 +11:00
Amitay Isaacs
85c8deca3f recoverd: Track the nodes that fail takeover run and set culprit count
If any of the nodes fail takeover run (either due to timeout or failure
to complete within takeover_timeout interval) from main loop, recovery
master will give up trying takeover run with following message:

  "Unable to setup public takeover addresses. Try again later"

And as a side-effect the monitoring is disabled on all the nodes. Before
ctdb_takeover_run() is called from main loop, monitoring gets disabled via
startrecovery event. Since ctdb_takeover_run() fails, it never runs
recovered event and monitoring does not get re-enabled.

In main_loop, ctdb_takeover_run() is called with a takeover_fail_callback.
This callback will get called if any of the nodes fail in handling
takeip/releaseip/ipreallocated events in ctdb_takeover_run().

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit a5c6bb1fffb8dc3960af113957a1fd080cc7c245)
2012-11-14 10:59:54 +11:00
Martin Schwenke
62046a8a4c recoverd: When starting a takeover run disable IP verification
Disable for TakeoverTimeout seconds.

Otherwise the recovery daemon can get overzealous and start trying
to add/delete addresses that it thinks are missing but where the
eventscript just hasn't finished.  This didn't use to matter so much
but it is more important now that concurrent takeip/releaseip/updateip
generate errors - we want to avoid spamming the log.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 56fcee3c7730cb12fa666072d5400949af6e5f7c)
2012-10-11 12:10:45 +11:00
Martin Schwenke
4b4e4d8870 ctdbd: Stop takeovers and releases from colliding in mid-air
There's a race here where release and takeover events for an IP can
run at the same time.  For example, a "ctdb deleteip" and a takeover
initiated by the recovery daemon.  The timeline is as follows:

1. The release code registers a callback to update the VNN.  The
   callback is executed *after* the eventscripts run the releaseip
   event.

2. The release code calls the eventscripts for the releaseip event,
   removing IP from its interface.

   The takeover code "updates" the VNN saying that IP is on some
   iface.... even if/though the address is already there.

3. The release callback runs, removing the iface associated with IP in
   the VNN.

   The takeover code calls the eventscripts for the takeip event,
   adding IP to an interface.

As a result, CTDB doesn't think it should be hosting IP but IP is on
an interface.  The recovery daemon fixes this later... but it
shouldn't happen.

This patch can cause some additional noise in the logs:

  Release of IP 10.0.2.133/24 on interface eth2  node:2
  recoverd:We are still serving a public address '10.0.2.133' that we should not be serving. Removing it.
  Release of IP 10.0.2.133/24 rejected update for this IP already in flight
  recoverd:client/ctdb_client.c:2455 ctdb_control for release_ip failed
  recoverd:Failed to release local ip address

In this case the node has started releasing an IP when the recovery
daemon notices the address is still hosted and initiates another
release.  This noise is harmless but annoying.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit bfe16cf69bf2eee93c0d831f76d88bba0c2b96c2)
2012-10-11 12:10:45 +11:00
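The race described in commit 4b4e4d8870 is avoided by rejecting a second update for an IP while one is already in flight (compare the "rejected update for this IP already in flight" log line). A minimal sketch of such a guard follows; the struct, field and function names are hypothetical, and the real daemon is event-driven, so no locking is shown.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical slice of per-IP state: is a takeip/releaseip/updateip
 * operation currently running for this address? */
struct example_vnn {
    bool update_in_flight;
};

/* Try to start an update. Returns 0 on success, or -1 if another update
 * for the same IP has not completed yet (the caller fails the control). */
int example_begin_ip_update(struct example_vnn *vnn)
{
    if (vnn->update_in_flight) {
        return -1;
    }
    vnn->update_in_flight = true;
    return 0;
}

/* Called from the completion callback, after the eventscript finishes. */
void example_end_ip_update(struct example_vnn *vnn)
{
    vnn->update_in_flight = false;
}
```

Because the flag is cleared only in the completion callback, a takeover that arrives between the releaseip eventscript and its callback is refused instead of interleaving with it.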
Martin Schwenke
79ea15bf96 ctdbd: New tunable NoIPTakeoverOnDisabled
Stops the behaviour where unhealthy nodes can host IPs when there are
no healthy nodes.  Set this to 1 when an immediate complete outage is
preferred when all nodes are unhealthy.  The alternative
(i.e. default) can lead to undefined behaviour when the shared
filesystem is unavailable.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a555940fb5c914b7581667a05153256ad7d17774)
2012-10-11 12:10:45 +11:00
Martin Schwenke
9aa9abcc19 ctdbd: Avoid unnecessary updateip event
The existing code makes one fatally bad assumption:
vnn->iface->references can never be -1 (or max-uint32_t in this case).
Right now the reference counting is broken so a reference count of -1
is possible and causes a spurious updateip when vnn->iface is the same
as best_iface.  This can occur frequently because we get a lot of
redundant takeovers, especially when each IP can only be hosted on one
interface.

This makes the code much more defensive by noting that when best_iface
is the same as vnn->iface there is never a need for an updateip event.
This effectively neuters the updateip code path when IPs can only be
hosted by a single interface.

This should obsolete 6a74515f0a1e24d97cee3ba05d89133aac7ad2b7.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 7054e4ded59c6b8f254dcfefaef64da05f25aecd)
2012-10-10 14:54:53 +11:00
Amitay Isaacs
3c1f656764 Revert "when creating/adding a public ip, set the initial interface to be the first interface specified"
This reverts commit 4308935ba48ac7a29e7523315acf580019715f0f.

This fixes 16_ctdb_config_add_ip.sh test when run against local daemons. When
running against local daemons, if the interface is assigned as soon as an IP is
added, then takeover would never assign this IP address.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 06dfd13604d08910e07cbf927c338d7b9fce9a2f)
2012-10-07 15:25:34 +11:00
Martin Schwenke
7df1da1c91 recoverd: Update a log message that has bit-rotted
This message used to be correct because the ipreallocated event only
handled updating the NAT gateway.  However, that has changed so the
message needs to be updated.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit cc9d96f4248e45ea99c5f00db1526426ac26fbc2)
2012-08-08 16:11:11 +10:00
Martin Schwenke
75a0041567 ctdbd: Fix ctdb_control_release_ip() on local daemons
When running on local daemons no IPs are actually assigned to
interfaces.  Commit 9a806dec8687e2ec08a308853b61af6aed5e5d1e broke
ctdb_control_release_ip() for local daemons because it asks the system
which interface the given IP is on, instead of the old behaviour of
trusting CTDB's internal records.

For local daemons (i.e. !ctdb->do_checkpublicip) revert to the old
behaviour of looking up the interface internally.  This is good
enough, given that the tests don't tend to misconfigure the addresses.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 38e8651b955afdbaf0ae87c24c55c052f8209290)
2012-07-26 22:10:54 +10:00
Amitay Isaacs
e379fc3ea5 Fix compiler warnings.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit d29e1880c8ce7219e065d31b47b0e8ad9e83146d)
2012-07-13 14:50:56 +10:00
Ronnie Sahlberg
c7e648c2d1 When we release an ip, get the interface name from the kernel
instead of using the interface where ctdb thinks the ip is hosted.
This allows us to handle cases where we want to release an ip but ctdbd does not know which interface the ip is assigned on
(for example, the user has run 'ip addr add...' and manually assigned an ip to the wrong interface).

(This used to be ctdb commit c6bf22ba5c01001b7febed73dd16a03bd3fd2bed)
2012-06-20 15:11:56 +10:00
Amitay Isaacs
7631830152 server: Replace BOOL datatype with bool, True/False with true/false
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 6e5cbe8fff71985e5a2fc16b7e9f2b868011ff5d)
2012-05-28 11:22:25 +10:00
Ronnie Sahlberg
a57eba2bb4 Track all child processes so we never send a signal to an unrelated process (our child died and kernel wrapped the pid-space and reused the pid for a different process)
Wrap all creation of child processes inside ctdb_fork() which is used to track all processes we have spawned.
Capture SIGCHLD to also track which child processes have terminated.

Wrap kill() inside ctdb_kill() and make sure that we never send a !0 signal to a child process pid that has already terminated (and might have been replaced with a

(This used to be ctdb commit f73a4b1495830bcdd094a93732a89dd53b3c2f78)
2012-05-03 14:03:26 +10:00
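The bookkeeping behind ctdb_fork() and ctdb_kill() in commit a57eba2bb4 can be illustrated with a small pid registry. This sketch does not call fork() or kill(); it only shows the tracking and the check, and the fixed-size table and all `example_` names are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

#define EXAMPLE_MAX_CHILDREN 64 /* hypothetical limit, for the sketch only */

/* Hypothetical table of child pids that are still believed to be alive. */
struct example_child_table {
    pid_t pids[EXAMPLE_MAX_CHILDREN];
    size_t count;
};

/* Called after a successful fork in the parent. Returns -1 if full. */
int example_track_child(struct example_child_table *t, pid_t pid)
{
    if (t->count >= EXAMPLE_MAX_CHILDREN) {
        return -1;
    }
    t->pids[t->count++] = pid;
    return 0;
}

/* Called when SIGCHLD handling reports that pid has terminated. */
void example_untrack_child(struct example_child_table *t, pid_t pid)
{
    size_t i;

    for (i = 0; i < t->count; i++) {
        if (t->pids[i] == pid) {
            /* Fill the hole with the last entry. */
            t->pids[i] = t->pids[--t->count];
            return;
        }
    }
}

/* Signal 0 only probes for existence, so it is always allowed. Any other
 * signal is allowed only for a pid that is still tracked, because an
 * untracked pid may have been reused by an unrelated process. */
bool example_kill_allowed(const struct example_child_table *t,
                          pid_t pid, int signum)
{
    size_t i;

    if (signum == 0) {
        return true;
    }
    for (i = 0; i < t->count; i++) {
        if (t->pids[i] == pid) {
            return true;
        }
    }
    return false;
}
```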
Ronnie Sahlberg
a367fa6138 RELOADIPS: simplify the reloadips code a bit
and also update the "read public address file" to not check if the address exists already locally when we read it from the child process, to stop it
from spamming the logs with "We already host ..."
messages

(This used to be ctdb commit 334ea830f1bf33419f4a1e78f23afd41a852d0f4)
2012-05-01 15:34:26 +10:00
Ronnie Sahlberg
7a1aa560e7 Add new control to reload the public ip address file on a node
Also add a method to use the recovery master/daemon to reload the public ips on all nodes in the cluster.
Reloading the public ips on all nodes in the cluster is only supported if all nodes in the cluster are available and healthy.

(This used to be ctdb commit 05603e914f8c12618d7e06943c0f7df207f645b0)
2012-05-01 10:48:08 +10:00
Ronnie Sahlberg
db411aaada Merge remote branch 'amitay/tevent-sync'
(This used to be ctdb commit 17ff3f240b0d72c72ed28d70fb9aeb3b20c80670)
2012-04-26 08:09:23 +10:00
Amitay Isaacs
4392591555 Remove explicit include of lib/tevent/tevent.h.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 0681014ca5ed2a9b56f63fdace7f894beccf8a9a)
2012-04-13 17:28:14 +10:00
Amitay Isaacs
b3d098ced7 ctdbd: Fix spurious warnings when running with --nopublicipcheck
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 67b909a0718d6cfce82ffce0830da3a6ff1f6c4b)
2012-04-13 15:38:11 +10:00
Amitay Isaacs
425b8768ee ctdbd: Fix the error message string
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 15f63ebab9686734f41a6adf38d4a7faa919ac66)
2012-04-13 14:51:13 +10:00
Ronnie Sahlberg
2456f77ca6 NoIPTakeover: change the tunable name for the "dont allow failing addresses over onto the node" to NoIPTakeover
(This used to be ctdb commit 35592e618cfd827b6978af6332f80504f232c46a)
2012-03-22 11:05:15 +11:00
Ronnie Sahlberg
9f31f76805 NoIPFailback: Exclude nodes which have NoIPFailback as failback targets during reallocation
(This used to be ctdb commit c262c29773d1608e7ce04bdfb7f4469df0a9637b)
2012-03-22 09:24:32 +11:00
Ronnie Sahlberg
befa9df152 Make NoIPFailback a node local setting. Nodes that have NoIPFailback set to !0 can not takeover new ip addresses during failover.
Remove the old global setting for this unused tunable and add it as a new node flag. This node flag is only valid/defined within the takeover subsystem in the recovery daemon. Add async functions to collect the NoIPFailback settings for each node.

This will later be used to disqualify certain nodes from being takeover targets when we perform reallocation.

(This used to be ctdb commit 668f3e88a9e5f598706952b7140547640c85a5ed)
2012-03-22 09:09:57 +11:00
Ronnie Sahlberg
ef2bd0b016 When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance.
(This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2)
2012-02-28 06:56:04 +11:00
Ronnie Sahlberg
91c9371f2d Make KILLTCP structure a child of VNN so that it is freed at the same time
the referenced VNN structure is.

Also, remove the circular reference between the two objects KILLTCP and VNN

(This used to be ctdb commit 02b62482164a3c69715949074feb7f191a29d534)
2012-02-27 07:21:26 +11:00
Volker Lendecke
5e3b13a32a FreeBSD does not define s6_addr32, only s6_addr
Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit d657af4fb68ce3f7c462856f2934f6bf169e120b)
2012-02-13 16:20:12 +01:00
Martin Schwenke
3ae8273d86 Make some ctdb_takeover.c functions static
These were intentionally not static so they could be linked to in unit
test programs.  However, using the CCAN-style unit tests where
relevant code is just included, this is no longer necessary.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d0e9e8554614bd49ffb9ec3509feaa0e80d0f65d)
2011-11-11 14:41:47 +11:00
Ronnie Sahlberg
8db9b73920 Merge remote branch 'martins/lcp2fix'
(This used to be ctdb commit 7c02d242af552aa732f5c70ea4eeefbc8a8542e2)
2011-11-08 14:06:30 +11:00
Ronnie Sahlberg
0f92fa224c RB_TREE: Add mechanism to abort a traverse
This patch changes the callback signature for traversal
functions to allow a client to abort a traverse before it finishes.
Updates to all callers and examples as well as rb-test tool.
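
The abortable traverse described above can be illustrated with a minimal Python sketch. This is not the rb_tree.c code: the tuple-based tree, the function name `traverse` and the convention that a non-zero callback result aborts the walk are all assumptions made for illustration.

```python
def traverse(node, callback):
    """In-order walk of a binary tree given as (left, key, right) tuples.

    Returns the first non-zero value returned by callback, or 0 if the
    whole tree was visited. A non-zero result stops the walk early
    (assumed convention, not taken from rb_tree.c).
    """
    if node is None:
        return 0
    left, key, right = node
    ret = traverse(left, callback)
    if ret != 0:
        return ret
    ret = callback(key)
    if ret != 0:
        return ret
    return traverse(right, callback)


# Hypothetical usage: stop as soon as key 2 has been seen.
tree = ((None, 1, None), 2, (None, 3, None))
seen = []


def stop_at_two(key):
    seen.append(key)
    return 1 if key == 2 else 0


result = traverse(tree, stop_at_two)
```

Here the callback is never invoked for key 3, because the walk is abandoned once the callback reports a non-zero value.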

(This used to be ctdb commit 8ab0c63ad36cfbbb1e5fed46a1f4c47b1fdb581f)
2011-11-08 13:40:28 +11:00
Martin Schwenke
c0939af571 LCP IP allocation algorithm - try harder to find a candidate source node
There's a bug in LCP2.  Selecting the node with the highest imbalance
doesn't always work.  Some nodes can have a high imbalance metric
because they have a lot of IPs.  However, these nodes can be part of a
group that is perfectly balanced.  Nodes in another group with less
IPs might actually be imbalanced.

Instead of just trying the source node with the highest imbalance this
tries them in descending order of imbalance until it finds one where
an IP can be moved to another node.
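
The "try harder" strategy above can be sketched roughly as follows, in Python rather than C. The names `pick_failback_move` and `try_move` are hypothetical; `try_move` stands in for whatever per-node step decides which address should move where, returning `None` when no improving move exists for that source node.

```python
def pick_failback_move(imbalances, try_move):
    """Try source nodes from most to least imbalanced.

    imbalances maps pnn -> imbalance metric. try_move(pnn) returns a
    move (any non-None value) or None if no IP on that node can be
    moved. Returns the first move found, or None.
    """
    for pnn in sorted(imbalances, key=imbalances.get, reverse=True):
        move = try_move(pnn)
        if move is not None:
            return move
    return None


# Hypothetical usage: node 0 has the highest metric but belongs to a
# perfectly balanced group, so the search falls through to node 1.
tried = []


def try_move(pnn):
    tried.append(pnn)
    return None if pnn == 0 else ("10.0.0.2", pnn, 2)


move = pick_failback_move({0: 900, 1: 400, 2: 0}, try_move)
```

Only trying the first node in that ordering would give up after node 0, which is the failure the commit message describes.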

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 574091d5aced5e87aefad52f8bc47aa75c25fbf6)
2011-11-02 10:17:00 +11:00
Martin Schwenke
98c27f973d LCP IP allocation algorithm - new function lcp2_failback_candidate()
There's a bug in LCP2.  Selecting the node with the highest imbalance
doesn't always work.  Some nodes can have a high imbalance metric
because they have a lot of IPs.  However, these nodes can be part of a
group that is perfectly balanced.  Nodes in another group with less
IPs might actually be imbalanced.

Factor out the code from lcp2_failback() that actually takes a node
and decides which address should be moved to which node.

This is the first step in fixing the above bug.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 75718c5768b5bb5c0bcd7dd90e0327c6ed22a63d)
2011-11-01 21:01:25 +11:00
Ronnie Sahlberg
d79596ba1a One of the entry points to release an ip reset the pnn field before invoking the eventscript.
This caused the "only run the eventscript if we host the address" check to fire and short-circuit the call to the eventscript.

An effect of this would be that 'ctdb delip' would remove the ip from ctdb, but fail to delete it from the interface.

S1028798

(This used to be ctdb commit b82524f240bf21769dd7624ca6026763d38b9396)
2011-09-22 15:17:23 +10:00
Ronnie Sahlberg
4587bdb052 when checking that the interfaces exist in ctdb_add_public_address()
we can't talloc off vnn since it is not yet initialized and might not always be NULL

(This used to be ctdb commit 3d37be3e2bfb61ede824028aeebaa18ba304faae)
2011-09-21 11:42:19 +10:00
Ronnie Sahlberg
783ceca07b Interface monitoring: add an event to trigger every 30 seconds to check that all interfaces referenced by the public address list actually exist.
This will make it much easier to root-cause problems such as
S1029023
when an external application deleted the interface while it is still in use by ctdbd.

(This used to be ctdb commit 9abf9c919a7e6789695490e2c3de56c21b63fa57)
2011-09-06 17:02:19 +10:00
Ronnie Sahlberg
64378fea58 Check interfaces: when reading the public addresses file to create the vnn list
check that the actual interface exists; print an error and fail startup if the interface does not exist.

(This used to be ctdb commit cd33bbe6454b7b0316bdfffbd06c67b29779e873)
2011-09-06 16:11:00 +10:00
Volker Lendecke
1cf1670f0a Fix a const warning
Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit e25559087c9752502580875f7e33f3c416c05f84)
2011-08-22 17:11:07 +02:00
Ronnie Sahlberg
fea64f65b5 Remove a log message about setting linkstate for an unknown interface.
Sometimes we do want to try to set the linkstate for interfaces that are not in use by public addresses right now (but possibly by other mechanisms), and these messages just spam the logs.

S1026357

(This used to be ctdb commit f2fe0a090a9650910ebe49514b3ca01dc593bea3)
2011-08-05 10:05:12 +10:00
Martin Schwenke
5ac67504ca Tests: Initial test code for LCP2 IP allocation algorithm.
Move struct ctdb_public_ip_list to ctdb_private.h and put some
definitions for some functions from ctdb_takeover.c there.  This
allows those functions to be called from unit tests.

Add ctdb_takeover_tests.c and the Makefile support to build it.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 9d34be0233edf3bc022345c0494c4b2a4d7f8480)
2011-07-29 09:01:36 +10:00
Martin Schwenke
ff1a81c872 IP allocation - add LCP2 algorithm.
The current non-deterministic IP allocation algorithm balances IPs
across the whole cluster.  It does not consider different
interfaces/VLANs/subnets, so these different groups of IPs aren't
generally well balanced.

This adds the LCP2 algorithm for IP allocation and allows it to be
enabled by setting the "LCP2PublicIPs" tunable to 1.

The LCP2 algorithm calculates the imbalance of a node by totalling the
squares of the distances between each IP on the node.  The IP distance
is defined as the length of the longest common prefix (LCP) of bits that is
found when comparing 2 IPs.  The imbalance of a cluster is the maximum
imbalance for any node.  At each step the algorithm selects the IP/node
combination whose allocation best reduces the imbalance of the cluster.
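
The imbalance metric described above can be sketched in a few lines of Python. This is only an illustration, not the C code in ctdb_takeover.c: the function names are hypothetical, each unordered pair of IPs on a node is counted once, and addresses of different families are assumed to have a distance of 0.

```python
import ipaddress
from itertools import combinations


def ip_distance(ip_a, ip_b):
    """Length in bits of the longest common prefix of two addresses."""
    a = ipaddress.ip_address(ip_a)
    b = ipaddress.ip_address(ip_b)
    if a.version != b.version:
        return 0  # assumption: no shared prefix across address families
    diff = int(a) ^ int(b)
    # The highest set bit of the XOR marks the first differing bit.
    return a.max_prefixlen - diff.bit_length()


def node_imbalance(ips):
    """Sum of squared distances over every pair of IPs on one node."""
    return sum(ip_distance(x, y) ** 2 for x, y in combinations(ips, 2))


def cluster_imbalance(allocation):
    """Maximum node imbalance; allocation maps pnn -> list of IPs."""
    return max((node_imbalance(ips) for ips in allocation.values()), default=0)


# Two addresses from the same subnet on one node score badly ...
clumped = {0: ["10.0.0.1", "10.0.0.2"], 1: ["192.168.0.1", "192.168.0.2"]}
# ... while spreading each subnet across the nodes scores well.
spread = {0: ["10.0.0.1", "192.168.0.1"], 1: ["10.0.0.2", "192.168.0.2"]}
```

Under these assumptions `cluster_imbalance(clumped)` is larger than `cluster_imbalance(spread)`, which is why the metric pushes addresses that share a long prefix onto different nodes.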

The implementation splits out the IP allocation part of
ctdb_takeover_run() into new function ctdb_takeover_run_core(), and
then extracts out the basic IP assignment code into new functions
basic_allocate_unassigned() and basic_failback().  3 new functions
lcp2_init(), lcp2_allocate_unassigned() and lcp2_failback() implement
the LCP2 algorithm, and are hooked into ctdb_takeover_run_core().

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 61fc7fbd0235469df22deb6581c6bd47e30bc0be)
2011-07-29 09:01:17 +10:00
Ronnie Sahlberg
e707f23596 Update the delip command
Don't talloc_free(vnn) immediately but postpone it until later when
the eventscript callback has completed.

CQ S1026664

(This used to be ctdb commit 0a99e8742a261b1d3a2c8830f5c19ea6c2c47cad)
2011-07-29 08:50:48 +10:00
Ronnie Sahlberg
c93a968619 When trying to re-balance the ip assignment and shuffle ips from
nodes with many addresses to nodes with few addresses,
loop up to num_ips+5 times instead of only 5 times.

When we have very many public ips per node, we might need to loop more than
5 times or else we will exit without reaching optimal balance.
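
The effect of the loop bound can be seen in a simplified Python sketch. It is an assumption-laden model, not the ctdb code: it only tracks per-node IP counts, ignores which node is able to host which address, and uses the hypothetical name `basic_rebalance`.

```python
def basic_rebalance(counts, max_loops):
    """Move one IP per loop from the fullest to the emptiest node.

    counts maps pnn -> number of IPs and is modified in place.
    Stops when the spread is at most 1 or max_loops is reached.
    """
    loops = 0
    while loops < max_loops:
        loops += 1
        fullest = max(counts, key=counts.get)
        emptiest = min(counts, key=counts.get)
        if counts[fullest] - counts[emptiest] <= 1:
            break  # already as balanced as this model can make it
        counts[fullest] -= 1
        counts[emptiest] += 1
    return counts


num_ips = 21
# Old bound: only 5 passes, so the cluster stays lopsided.
old = basic_rebalance({0: 20, 1: 0, 2: 1}, 5)
# New bound: num_ips + 5 passes are enough to even things out.
new = basic_rebalance({0: 20, 1: 0, 2: 1}, num_ips + 5)
```

With many public IPs on one node, the fixed bound of 5 runs out long before the counts converge, whereas a bound that grows with the number of IPs does not.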

(This used to be ctdb commit aa8114a625a637277561a66c80bdece3c27e9e20)
2011-07-06 13:14:13 +10:00
Ronnie Sahlberg
f84bd3b5f1 Don't call the UPDATE event if the old and new interfaces are the same.
CQ S1018175

(This used to be ctdb commit 6a74515f0a1e24d97cee3ba05d89133aac7ad2b7)
2011-05-04 13:29:29 +10:00
Ronnie Sahlberg
c04505724a IFACE handling. Assume links are always good on startup (they almost always
Simplify the handling of setting the links in the 10.interface eventscript
and remove the optimization to only call setifacelink on state change
to make the code simpler to read.

If a take ip event fails, flag the node as unhealthy.

Add a check to the interface script for whether the interface exists
or has been deleted,
so that we can detect this and become UNHEALTHY if someone deletes an interface
we are using to host public addresses.

(This used to be ctdb commit 4ab63d2a7262aff30d5eced184c294c9c9dd4974)
2011-04-11 07:40:05 +10:00
Ronnie Sahlberg
f82936402f IP reallocation. If a public address is already hosted on the node when we startup, log a warning message but do not cause the recovery to fail.
CQ S1022356

Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit 89f8169c24da96c1fdd0ac19b8a1e0e1df01a72a)
2011-03-14 13:35:53 +01:00
Ronnie Sahlberg
93bea39391 IPALLOCATION : If the node is held pinned down in "init" state
by external services failing to start, or blocking CTDBD from finishing the startup phase,
we can encounter a situation where we have not yet fully initialized, but a
remote recovery master tries to release a certain ip clusterwide.

In this situation the node that is pinned down in init/startup phase
would fail to perform the release of the ip address since we are not yet fully operational and do not yet host any valid interfaces.

In this situation, we just need to remain unhealthy; there is no need to
also ban the node.

Remove the autobanning for this condition and just let the node remain in
unhealthy mode.
Banning is overkill in this situation when the system is broken and just
draws attention to ctdbd instead of the root cause.

(This used to be ctdb commit d8af74e4c4961deb94c18dde8ba7fc07e944729c)
2011-01-13 09:42:01 +11:00
Ronnie Sahlberg
a9a6ae064d When assigning the single-public-ip during startup,
flag the interface as initially being "link ok"
so that we can add it and startup.

The eventscript can later drop the flag if required

(This used to be ctdb commit 720849b756c825fb8b285f09972a8c39f1888a99)
2010-12-13 14:24:04 +11:00
Ronnie Sahlberg
c2c53db49d during ip allocation, there are failure modes where a node might hold an ip address
but think it is still unassigned (-1).

add code to the recovery daemon to detect this case and trigger a reallocation
so that the ip gets covered

and change the takeip code to allow for this condition, taking on an ip address that is
already hosted.

cq s1021073

(This used to be ctdb commit 9020baf27cab7821c9094cda185206fb7af0fee7)
2010-12-03 13:30:39 +11:00
Ronnie Sahlberg
dbcf0de18c Don't exit the update ip function if the old and new interfaces are the same
since if they are the same for whatever reason this triggers the system
to go into an infinite loop and is not robust

The scripts have been changed instead to be able to cope with this
situation for enhanced robustness

During takeover_run and when merging all ip allocations across the cluster
try to keep track of when and which node currently hosts an ip address
so that we avoid extra ip failovers between nodes

(This used to be ctdb commit cf778b5aaf6356401e3985acccc7df9e08ab6930)
2010-11-10 14:55:25 +11:00
Ronnie Sahlberg
6fa8e1fddb when we load the public address file, at the same time check if we are already hosting the public address, if so, set ourselves up as the pnn for that address
(This used to be ctdb commit 0f2a2dac91a61be188c3578c8bb89d47cbf9a0f8)
2010-11-10 14:55:24 +11:00
Ronnie Sahlberg
5f76f3c0e2 Add a new tunable, DisableIPFailover, that when set to non-zero
will stop any ip reallocations at all from happening.

(This used to be ctdb commit d8d37493478a26c5f1809a5f3df89ffd6e149281)
2010-11-10 14:55:24 +11:00
Ronnie Sahlberg
87a0ece976 when creating/adding a public ip, set the initial interface to be the first interface specified
(This used to be ctdb commit 4308935ba48ac7a29e7523315acf580019715f0f)
2010-11-10 14:55:23 +11:00
Ronnie Sahlberg
d8d8b9e1d7 add a new serverid to send a message every time an ip address is taken on the local node
(This used to be ctdb commit 1261f3d9702800a4e59550c881350daf479f00ef)
2010-09-13 15:43:19 +10:00
Ronnie Sahlberg
19211f99c8 remove an unused variable
(This used to be ctdb commit e07fdbaf12bbe84370bc47a1979fe198a06a6cc8)
2010-09-13 13:13:12 +10:00
Ronnie Sahlberg
c95f4258d8 Add a new event "ipreallocated"
This is called every time a reallocation is performed.

    While STARTRECOVERY/RECOVERED events are only called when
    we do ipreallocation as part of a full database/cluster recovery,
    this new event can be used to trigger on when we just do a light
    failover due to a node becoming unhealthy.

    I.e. situations where we do a failover but we do not perform a full
    cluster recovery.

    Use this to trigger for natgw so we select a new natgw master node
    when failover happens and not just when cluster rebuilds happen.

(This used to be ctdb commit 7f4c591388adae20e98984001385cba26598ec67)
2010-08-30 18:09:30 +10:00
Ronnie Sahlberg
2e8aac6689 Merge commit 'rusty/ports-from-1.0.112' into foo
(This used to be ctdb commit 13e58d92f5f1723e850a82ae030d0ca57e89b1ee)
2010-08-19 13:17:56 +10:00
Ronnie Sahlberg
5aa5f3e7bf Remove the structure ctdb_control_tcp_vnn since this is identical to the structure ctdb_tcp_connection.
Add a new "ctdb deltickle" command to delete tickles from the database.
This can ONLY be used for tickles created by "ctdb addtickle".

Push any "addtickle/deltickle" updates to other nodes every TickleUpdateInterval seconds.

(This used to be ctdb commit acded034e2f0dcae4c2c9e54e16a001caf23caec)
2010-08-18 12:36:03 +10:00
Rusty Russell
1a009aff73 takeover: prevent crash by avoiding free in traverse on RST timeout
After 5 attempts to send a RST to a client without any response, we free
"con"; this is done during a traverse.  This frees the node we are walking
through (the node is made a child of "con" down in rb_tree.c's
trbt_create_node()).  Valgrind would catch this, as Martin confirmed.

So, we create a temporary parent and reparent onto that; then we free
that parent after the traverse, thus deleting the unwanted nodes.

CQ:S1019041
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 08f7f85477610a4916c1ec866aa467b28f1bbec3)
2010-08-18 11:40:17 +09:30
Rusty Russell
f93440c4b7 event: Update events to latest Samba version 0.9.8
In Samba this is now called "tevent", and while we use the backwards
compatibility wrappers they don't offer EVENT_FD_AUTOCLOSE: that is now
a separate tevent_fd_set_auto_close() function.

This is based on Samba version 7f29f817fa.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 85e5e760cc91eb3157d3a88996ce474491646726)
2010-08-18 09:16:31 +09:30
Ronnie Sahlberg
4136f27145 When adding an ip at runtime, it might not yet have an iface assigned to it, so ensure that the next takeover_ip call will fall through to accept the ip and add it.
(This used to be ctdb commit 2d60f96680d16c2992e2a35517822f88c12538b7)
2010-06-01 16:22:48 +10:00
Ronnie Sahlberg
92340e4d6f check if vnn is a valid pointer before dereferencing it
based on Rusty's patch for bz62783

(This used to be ctdb commit bdd250b9afdd1060cfd1e2b0f0a5a567150bb380)
2010-05-26 13:43:28 +10:00
Ronnie Sahlberg
4a43428440 The recent change to the recovery daemon to keep track of and
verify that all nodes agree on the most recent ip address assignments
broke "ctdb moveip ..." since that call would never trigger
a full takeover run and thus would immediately trigger an inconsistency.

Add a new message to the recovery daemon where we can tell the recovery daemon to update its assignments.

BZ62782

(This used to be ctdb commit e7069082e5f0380dcddee247db8754218ce18cab)
2010-05-03 15:47:17 +10:00
Ronnie Sahlberg
c3c7aa934f Make create_merged_ip_list() a static function since
it is not called from outside of ctdb_takeover.c

(This used to be ctdb commit 880896a27adfdd5173b2810b6b2f3889802046f0)
2010-05-03 15:47:06 +10:00
Ronnie Sahlberg
79fac9771d In the log message when we have found an inconsistent ip address allocation,
add extra log information about what the inconsistency is.

(This used to be ctdb commit d2e4a9912c4bd13eb4f12681adebe7e59a6d1fb2)
2010-05-03 15:46:36 +10:00
Ronnie Sahlberg
06885ea9a7 In the recovery daemon, keep track of which node we have assigned public ip
addresses and verify that the remote nodes have/keep a consistent view of
assigned addresses.

If a remote node has an inconsistent view of addresses vis-à-vis the recovery
master this will trigger a full ip reallocation.

(This used to be ctdb commit f3bf2ab61f8dbbc806ec23a68a87aaedd458e712)
2010-04-08 14:25:26 +10:00
Ronnie Sahlberg
7f2f7364ad lower the loglevel for a debug message for redundant releases of public ips
(This used to be ctdb commit cfc1a4f878b61c85063af649d2339431e799647d)
2010-02-16 11:01:09 +11:00
Stefan Metzmacher
76cb4ce34c server: ban ourself if the ctdb and kernel knowledge of a public ip differs
metze

(This used to be ctdb commit 48e0af91113d6cead6cae3f28d8d8f610cacaa71)
2010-01-20 11:11:04 +01:00
Stefan Metzmacher
405368eeb0 server: give an error if we're getting a takeover_ip event with a wrong pnn
metze

(This used to be ctdb commit 2f44d6f3d290cc1b37b19ec34edfbad12cc0c0a7)
2010-01-20 11:11:04 +01:00
Stefan Metzmacher
a5ba5c129a server: return an error if we get a takeover ip event and we cannot serve the ip
metze

(This used to be ctdb commit f5c221e6abc118aefa489aa7e07755af952fd2bb)
2010-01-20 11:11:03 +01:00
Stefan Metzmacher
55d824bd77 server: print node number as signed integer on release ip event
metze

(This used to be ctdb commit 6c456face30606641f6b8beaad3121c9b05ca763)
2010-01-20 11:11:03 +01:00
Stefan Metzmacher
c5e579b56a server: debug redundant takeover ip events with level INFO
metze

(This used to be ctdb commit 7bc9969c4c28f2c4a4848bd730db3c63bb9204fe)
2010-01-20 11:11:03 +01:00
Stefan Metzmacher
ffdf32dedf server: be less verbose on redundant release_ip events
metze

(This used to be ctdb commit 72ef5f891f85ce51f5ca7e0c03d0c7cc955be110)
2010-01-20 11:11:03 +01:00
Stefan Metzmacher
58d7c44b1c server: add a ctdb_do_updateip()
metze

(This used to be ctdb commit eded224368dded2264e53546c196b1b485cb2094)
2010-01-20 11:11:02 +01:00
Stefan Metzmacher
aa485b17bb server: split out a ctdb_do_takeover_ip() function
metze

(This used to be ctdb commit 8fd6f4aab0c173b4c9c4c02c546e7d2ec1a98423)
2010-01-20 11:11:02 +01:00
Stefan Metzmacher
da59e0b162 server: split out a ctdb_announce_vnn_iface() function
metze

(This used to be ctdb commit ec87a51660cfa8a6851923f757fed31f7ffc7153)
2010-01-20 11:11:02 +01:00
Stefan Metzmacher
179c098e86 server: start with disabled interfaces and let the event scripts enable the interfaces explicitly
This makes sure that we don't get public addresses assigned during the
initial recovery and remove them again in the startup event.

metze

(This used to be ctdb commit f872e8c63a2f8979e6a0d088630575bdd4d7b4f1)
2010-01-20 11:11:01 +01:00
Stefan Metzmacher
f4f72024fe server: implement ctdb_control_set_iface_link()
This only marks the interface status and doesn't
generate any directly triggered action.

The action is later taken by the recovery process
in verify_ip_allocation.

metze

(This used to be ctdb commit cff58b27c970e9252d131125941c372019fd6660)
2010-01-20 11:10:59 +01:00
Stefan Metzmacher
0dd7e1bfa1 server: implement ctdb_control_get_ifaces()
metze

(This used to be ctdb commit 0e982a416a126d9856145c19baef320cd0e71d66)
2010-01-20 11:10:59 +01:00
Stefan Metzmacher
80e3ab04de server: implement ctdb_control_get_public_ip_info()
metze

(This used to be ctdb commit 486fbd15f4cc4f45a4c110b2ddbba48bade22c9f)
2010-01-20 11:10:59 +01:00
Stefan Metzmacher
32d00d0a0d controls: add stubs for GET_PUBLIC_IP_INFO, GET_IFACES and SET_IFACE_LINK_STATE
metze

(This used to be ctdb commit a2c9e4578e149eccb2c6183f64a6b657eb95c5e1)
2010-01-20 11:10:59 +01:00
Stefan Metzmacher
37880b0d0a server: use CTDB_PUBLIC_IP_FLAGS_ONLY_AVAILABLE during a takeover run
We now ask for the known and available interfaces.
This means a node gets a RELEASE_IP event for all interfaces
it "knows", but doesn't serve and a node only gets a TAKE_IP event
for "available" interfaces.

metze

(This used to be ctdb commit a695a38e49e7c3e15a9706392dc920eeab1f11ba)
2010-01-20 11:10:59 +01:00
Stefan Metzmacher
d89604afab server: implement CTDB_PUBLIC_IP_FLAGS_ONLY_AVAILABLE behavior
metze

(This used to be ctdb commit 09a5c59bc8d1301edf60d7ae77504dc6d11a7da2)
2010-01-20 11:10:58 +01:00
Stefan Metzmacher
bea53c60b8 server: keep the interface information in a list of ctdb_iface structures
metze

(This used to be ctdb commit ff5291778f0752e176539397e9530dcf0e546bea)
2010-01-20 11:10:58 +01:00
Stefan Metzmacher
539ebdc94c server: we don't need to copy strings we pass as talloc_asprintf() arguments
metze

(This used to be ctdb commit 080ba5ac2195fb73ef6f18740abdde57a7b97151)
2010-01-20 11:10:58 +01:00
Stefan Metzmacher
a1da4e05b5 server: allow multiple interfaces comma separated in public_addresses
metze

(This used to be ctdb commit 33a00ef7233051acdbc66410130ec5d876a8422f)
2010-01-20 11:10:58 +01:00
Stefan Metzmacher
8d50eda2b1 server: add a ctdb_vnn_iface_string() helper function to access vnn->iface
metze

(This used to be ctdb commit 9e5532e215892b2e0aadd9b106a730727f92c62e)
2010-01-20 11:10:58 +01:00
Stefan Metzmacher
bec35e6441 server: add a ctdb_set_single_public_ip() helper function
metze

(This used to be ctdb commit 400b4806c4a9686a2ee6398b5d7c3e0ca0793fd1)
2010-01-20 11:10:57 +01:00
Rusty Russell
928b8dcb31 eventscript: handle banning within the callbacks
Currently the timeout handler in eventscript.c does the banning if a
timeout happens.  However, because monitor events are different, it has
to special case them.

As we call the callback anyway in this case, we should make that handle
-ETIME as it sees fit: for everyone but the monitor event, we simply ban
ourselves.  The more complicated monitor event banning logic is now in
ctdb_monitor.c where it belongs.

Note: I wrapped the other bans in "if (status == -ETIME)", though they
should probably ban themselves on any error.  This change should be a
noop.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 9ecee127e19a9e7cae114a66f3514ee7a75276c5)
2009-12-07 23:48:57 +10:30
Ronnie Sahlberg
569001afd0 Merge commit 'martins/status-test-2'
Conflicts:

	server/eventscript.c

(This used to be ctdb commit e9b3477a5b9a2eff18f727e7d59338bfb5214793)
2009-12-01 10:53:18 +11:00
Martin Schwenke
a64ccf07c1 Add flag to ctdb_event_script_callback indicating when called by client.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a1d654a982ca56fade82552f4e6b5586236d3233)
2009-11-26 15:49:49 +11:00
Ronnie Sahlberg
926261aafc use a binary tree and sort all ipv4/v6 addresses before we assign them out on nodes.
(This used to be ctdb commit 862526e558099fad4c8259cb88da9b776aa7f80d)
2009-11-25 11:54:40 +11:00
Rusty Russell
2d9254404d eventscript: introduce enum for different event script calls.
Rather than doing strcmp everywhere, pass an explicit enum around.  This
also subtly documents what options are available.  The "options" arg
is now used for extra arguments only.

Unfortunately, gcc complains on empty format strings, so we make
ctdb_event_script() take no varargs, and add ctdb_event_script_args().  We
leave ctdb_event_script_callback() taking varargs, which means callers
have to do "%s", "".

For the moment, we have CTDB_EVENT_UNKNOWN for handling forced scripts
from the ctdb tool.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 8001488be4f2beb25e943fe01b2afc2e8779930d)
2009-11-24 11:16:49 +10:30
Rusty Russell
2763df22de eventscript: put timeout inside ctdb_event_script_callback_v
Everyone uses the same timeout value, so just remove it from the API.
If we ever need variable timeouts, that might as well be central too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 533c3e053293941d2a9484b495e78d45f478bb08)
2009-11-24 11:09:46 +10:30
Ronnie Sahlberg
e07ca41886 change the eventscript handling to allow EventScriptTimeout for each individual script instead of for the entire set of scripts
restructure the talloc hierarchy to allow this

(This used to be ctdb commit 64da4402c6ad485f1d0a604878a7b0c01a0ea5f0)
2009-10-28 16:11:54 +11:00
Ronnie Sahlberg
902c476c03 From Volker L
Fix some warnings  and an incorrect check for a talloc failure

(This used to be ctdb commit 27296a47b3d057a6729287acf128b2b67775ecde)
2009-10-22 12:19:40 +11:00
Ronnie Sahlberg
50712d48d3 change some loglevels and also print the pnn of the ip for takeip/releaseip logging
(This used to be ctdb commit 9d95dfbd12898975ba0d8560d95a974210d3de7c)
2009-10-06 11:40:38 +11:00
Ronnie Sahlberg
3133dadd8f allocate takeoverip state as a child of vnn and also make the takeoverip context a child of vnn
(This used to be ctdb commit 804e5905be51f43c8a338bfbe216fd8d5718850f)
2009-10-06 09:35:15 +11:00
Ronnie Sahlberg
263d76f8c2 lower the loglevel for the info messages that a public ip is not hosted locally for takeip/releaseip
(This used to be ctdb commit f76132b0d555e52ee0a379ec2c156350b37b0280)
2009-09-04 04:09:30 +10:00
Ronnie Sahlberg
1593e67399 send ARPs with an interval of 1.1 seconds during ip takeover.
This is to better handle Linux clients, which often default to ignoring gratuitous ARPs that arrive within 1 second of each other.

(This used to be ctdb commit 5664da36943b4901a807a9594b0f45e859aafbf3)
2009-07-07 11:40:01 +10:00
Ronnie Sahlberg
b046f5e3aa when adding an ip, try manually adding and taking over the ip instead of triggering a full recovery to do the same thing
(This used to be ctdb commit 4d5d22e64270cfb31be6acd71f4f97ec43df5b2c)
2009-06-05 17:00:47 +10:00
Ronnie Sahlberg
e6170b5389 add a new node state : DELETED.
This is used to mark nodes as being DELETED internally in ctdb
so that nodes are not renumbered if / when they are removed from the nodes file.

This is used to be able to do "ctdb reloadnodes" at runtime without
causing nodes to be renumbered.
To do this, instead of deleting a node from the nodes file, just comment it out like

   1.0.0.1
   #1.0.0.2
   1.0.0.3

After removing 1.0.0.2 from the cluster,  the remaining nodes retain their
pnn's from prior to the deletion, namely 0 and 2

Any line in the nodes file that is commented out represents a DELETED pnn
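
The numbering rule above can be illustrated with a small Python sketch. It is not the ctdb parser: the function name `parse_nodes_file` is hypothetical, and the handling of blank lines (skipped without consuming a pnn) is an assumption.

```python
def parse_nodes_file(text):
    """Return a list indexed by pnn; None marks a DELETED node."""
    nodes = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines do not consume a pnn
        # A commented-out line still occupies its slot, so later
        # nodes keep their original pnn.
        nodes.append(None if line.startswith("#") else line)
    return nodes


nodes = parse_nodes_file("""
   1.0.0.1
   #1.0.0.2
   1.0.0.3
""")
active_pnns = [pnn for pnn, addr in enumerate(nodes) if addr is not None]
```

For the example file above, the two remaining nodes keep pnn 0 and pnn 2, and pnn 1 stays reserved for the deleted entry.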

(This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343)
2009-06-01 14:18:34 +10:00
Sumit Bose
2fcedf6dac add missing checks on so far ignored return values
Most of these were found during a review by Jim Meyering <meyering@redhat.com>

(This used to be ctdb commit 3aee5ee1deb4a19be3bd3a4ce3abbe09de763344)
2009-05-21 11:22:21 +10:00
Ronnie Sahlberg
9a3e19658d Change the loglevel of "registered tcp client for ..." to INFO
instead of ERR

(This used to be ctdb commit 92b5580c38c23b99c1692708540983b0c0fcd6cf)
2009-05-19 08:55:42 +10:00
Michael Adam
3cca0f75e4 Fix treatment of link local ipv6 addresses: set the scope id.
metze / Michael

Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit 9d12de1ca6107801dada927729e755c0949d73bf)
2009-01-19 22:50:53 +01:00
Stefan Metzmacher
23b550d6fc Fix segfault in ip takeover fallback code.
metze

Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit 3b88f3dec5227e8579672974f7028fb356ee1d94)
2009-01-16 07:22:59 +11:00
root
321866dbba finish the ipv6 support.
allow clients to register either ipv4 or ipv6 client connections to the tickles list

(This used to be ctdb commit d9b44d7c3255b0fd7359b9afeb613e6ff4c4eaac)
2009-01-13 16:17:20 +11:00
Ronnie Sahlberg
b9bd20ce55 add a context and a timed event so that once we have been in recovery
mode for too long we drop all public ip addresses

(This used to be ctdb commit 403c68f96e1380dd07217c688de2730464f77ea0)
2008-10-22 11:04:41 +11:00
Ronnie Sahlberg
233b0e5cbb lower the loglevel for the informational message that a TCP_ADD operation
described an ip address not known to be a public address.

This could happen if someone for genuine reasons accesses a share
through a static ip address.
It can also happen if non-homogeneous public address configurations are
used and when a tcp description is pushed out to a different node that
does not serve/know the specific ip address.

(This used to be ctdb commit 9b1d089c99413f3681440f3cf33c293d118c9108)
2008-10-15 03:02:09 +11:00
Ronnie Sahlberg
cb300382b0 update TAKEIP/RELEASEIP/GETPUBLICIP/GETNODEMAP controls so we retain an
older ipv4-only version of these controls.

We need this so that we are backward compatible with old versions of ctdb
and so that we can interoperate with an ipv4-only recmaster during a
rolling upgrade.

(This used to be ctdb commit 6b76c520f97127099bd9fbaa0fa7af1c61947fb7)
2008-10-14 10:40:29 +11:00
Ronnie Sahlberg
3411e98e14 skip empty lines in the public addresses file, instead of skipping all non-empty
lines

(This used to be ctdb commit dc108adada33bb713f71a2859eda3b439ed0cd1a)
2008-10-07 19:34:34 +11:00
Ronnie Sahlberg
374906860c from Michael Adam : allow #-style comments in the nodes and public
addresses file
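
A rough Python sketch of the parsing rules mentioned in this part of the log: skip empty lines, skip #-style comments, and (from a later commit in this log) accept several comma-separated interfaces. The function name, the example addresses and the exact "address/mask interfaces" line layout are assumptions for illustration, not a copy of the ctdb parser.

```python
def parse_public_addresses(text):
    """Return a list of (address, [interfaces]) tuples."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip empty lines and #-style comments
        fields = line.split()
        address = fields[0]
        # Interfaces are optional here; several may be given, comma separated.
        ifaces = fields[1].split(",") if len(fields) > 1 else []
        entries.append((address, ifaces))
    return entries


entries = parse_public_addresses("""
# public addresses for this node (made-up example)

10.0.0.1/24 eth0,eth1
10.0.0.2/24 eth0
""")
```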

(This used to be ctdb commit 5f96b33a379c80ed8a39de1ee41f254cf48733f9)
2008-10-07 19:25:10 +11:00
Ronnie Sahlberg
70c7525a02 zero out the address structure to keep valgrind happy
(This used to be ctdb commit 8060e591b0eb2d184b5a7444487477225d2e1dbf)
2008-08-29 12:26:02 +10:00
Ronnie Sahlberg
a35fa0aa8f rename ctdb_tcp_client back to the original name ctdb_control_tcp
(This used to be ctdb commit 4d1c0418cfe6170bc081684dbe45908a5d285f0b)
2008-08-27 10:24:35 +10:00
Ronnie Sahlberg
eb23d7b6d4 we must canonicalize the sockaddr structures in killtcp so that we do the necessary downgrade if required
(This used to be ctdb commit 2f8b33948e395228cbac3450c0c684e49069abf0)
2008-08-20 12:02:54 +10:00
Ronnie Sahlberg
ef997d344f initial ipv6 patch
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>

(This used to be ctdb commit 1f131f21386f428bbbbb29098d56c2f64596583b)
2008-08-19 14:58:29 +10:00
Andrew Tridgell
1431210d46 fixed send of release IP message
(This used to be ctdb commit db6bc3745a56cc12e60e727190a098a6527690d6)
2008-08-08 22:06:39 +10:00
Andrew Tridgell
b3bcb42774 fixed a warning
(This used to be ctdb commit 015cd221c3c62eaa3cd0351fb8e93292c7c293aa)
2008-07-04 17:04:37 +10:00
Ronnie Sahlberg
d8433cacb2 first cut to convert takeover_callback_state{}
to use ctdb_sock_addr instead of sockaddr_in

(This used to be ctdb commit 5444ebd0815e335a75ef4857546e23f490a22338)
2008-06-04 17:12:57 +10:00
Ronnie Sahlberg
598fba7fad fix a comment
note that we don't actually send the ipv6 "gratuitous arp" on the wire just yet.
(since ipv6 doesn't use arp)
but all the infrastructure is there for when we implement sending raw neighbor discovery packets

(This used to be ctdb commit b87fab857bc9b3537527be93b7f68484502d6b84)
2008-06-04 15:23:06 +10:00
Ronnie Sahlberg
7d39ac131b convert handling of gratuitous arps and their controls and helpers to
use the ctdb_sock_addr structure so they work for both ipv4 and ipv6

(This used to be ctdb commit 86d6f53512d358ff68b58dac737ffa7576c3cce6)
2008-06-04 15:13:00 +10:00
Ronnie Sahlberg
92a0c0fc13 lower the loglevel for the warning that releaseip was called for a non-public address.
the address might be a public address on a different node so there is no need to fill up the logs with those messages

(This used to be ctdb commit c8181476748395fe6ec5284c49e9d37b882d15ea)
2008-05-21 11:50:41 +10:00
Ronnie Sahlberg
9c23bf7776 lower the loglevel for when we have "tickles" for an ip address that is not a public address on the local node (it may be a public address on other nodes)
(This used to be ctdb commit 1360c2f08a463f288b344d02025e84113743026d)
2008-05-21 11:44:50 +10:00
Ronnie Sahlberg
f4fd4d0af8 don't disable/enable monitoring for each eventscript, instead
just disable the monitoring during the "startrecovery" event and enable it again once recovery has completed

(This used to be ctdb commit 68029894f80804c9f31fc90ed0c1b58f75812c3d)
2008-05-16 08:20:40 +10:00
Ronnie Sahlberg
909ff219e0 Start implementing support for ipv6.
This enhances the framework for sending tcp tickles to be able to send ipv6 tickles as well.

Since we can not use one single RAW socket to send both handcrafted ipv4 and ipv6 packets, instead of always opening TWO sockets, one ipv4 and one ipv6 we get rid of the helper ctdb_sys_open_sending_socket() and just open (and close)  a raw socket of the appropriate type inside ctdb_sys_send_tcp().
We know which type of socket v4/v6 to use based on the sin_family of the destination address.
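
The family selection described above amounts to something like the following Python sketch. It only shows the decision, not the raw-socket handling in ctdb_sys_send_tcp(); the function name `raw_socket_family` is hypothetical, and the family is derived here by parsing a textual address rather than by reading sin_family.

```python
import ipaddress
import socket


def raw_socket_family(dest):
    """Pick the socket family matching the destination address."""
    addr = ipaddress.ip_address(dest)
    return socket.AF_INET if addr.version == 4 else socket.AF_INET6


# A sender would open (and afterwards close) one raw socket of this
# family per packet, instead of keeping two sockets open permanently.
v4_family = raw_socket_family("10.0.0.1")
v6_family = raw_socket_family("fe80::1")
```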

Since ctdb_sys_send_tcp() opens its own socket we no longer need to pass a socket
descriptor as a parameter.  Get rid of this redundant parameter and fixup all callers.

(This used to be ctdb commit 406a2a1e364cf71eb15e5aeec3b87c62f825da92)
2008-05-14 15:47:47 +10:00
Ronnie Sahlberg
0d7b34c9e5 Add two new controls to add/delete public ip address from a node at runtime.
The controls only modify the runtime setting of which public addresses a node
can serve and does not modify /etc/ctdb/public_addresses.
To make the change permanent you also need to edit /etc/ctdb/public_addresses
manually.

After ip addresses have been added/deleted you need to invoke a recovery
for the ip addresses to be redistributed.

(This used to be ctdb commit f8294d103fdd8a720d0b0c337d3973c7fdf76b5c)
2008-03-27 09:23:27 +11:00
Ronnie Sahlberg
e19264ea26 change the log level for the message when someone connects to a non-public ip
(This used to be ctdb commit bc9c4f0d52e9b06aceb08cea99ed3fd20b44616c)
2008-03-13 07:54:55 +11:00
Ronnie Sahlberg
a89ed0fdc2 add a new tunable 'NoIPFailback'
when this tunable is set, ip addresses will only be failed over when a node
fails. And only those ip addresses held by the failed node will be reallocated
in the cluster.

When a node becomes active again, this will not lead to any failback of ip addresses.

This can reduce the number of "ip address movements" in the cluster since we don't automatically fail an ip address back, but can also lead to an unbalanced cluster since we no longer attempt to spread the ip addresses out evenly across the active nodes.

This tunable can NOT be active at the same time as DeterministicIPs.

(This used to be ctdb commit d3b8a461b15bc584fa1785eb5922de6d49d8f6c4)
2008-03-03 12:52:16 +11:00
Ronnie Sahlberg
e08519b74d when we reallocate the ip addresses for nodes, we must make sure that
a node that has been allocated to serve an ip actually CAN serve that ip
(if we use differing public_addresses files on each node)

(This used to be ctdb commit fdaf7cb2d7682507fbf4c6c2b833b327c93fac08)
2008-03-03 10:53:23 +11:00
Andrew Tridgell
f6e53f433b merge from ronnie
(This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c)
2008-02-04 20:07:15 +11:00
Andrew Tridgell
9d6ac0cf55 added debug constants to allow for better mapping to syslog levels
(This used to be ctdb commit 7ba8f1dde318eab03f4257e5a89fd23e7281e502)
2008-02-04 17:44:24 +11:00
Andrew Tridgell
146d4b0db7 merge async recovery changes from Ronnie
(This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2)
2008-01-29 13:59:28 +11:00
Andrew Tridgell
e9987cf236 fixed a warning
(This used to be ctdb commit f34d0f9351c1cda3327efb14e173f249f7854570)
2008-01-05 09:30:49 +11:00
Andrew Tridgell
7edb41692e merge from ronnie
(This used to be ctdb commit 6653a0b67381310236e548e5fc0a9e27209b44e0)
2007-12-03 10:19:24 +11:00
Ronnie Sahlberg
50573c5391 add ctdb_disable/enable_monitoring() that only modifies the monitoring
flag.
change calling of the recovered/takeip/releaseip event scripts to use 
these enable/disable functions instead of stopping/starting monitoring.

when we disable monitoring we want all events to still be running,
in particular the events to monitor for dead nodes, and we only want to
suppress running the monitor event scripts

(This used to be ctdb commit a006dcc4f75aba950dd701ad7d1a84e89df285e8)
2007-11-30 10:09:54 +11:00
Ronnie Sahlberg
8ac8cce487 don't manipulate ctdb->monitoring_mode directly from the SET_MON_MODE
control, instead call ctdb_start/stop_monitoring()

ctdb_stop_monitoring(): don't allocate a new monitoring context, leave it 
NULL. Also set the monitoring_mode in this function so that 
ctdb_stop/start_monitoring() and ->monitoring_mode are kept in sync.
Add a debug message to log that we have stopped monitoring.

ctdb_start_monitoring()  check whether monitoring is already active and 
make the function idempotent.
Create the monitoring context when monitoring is started.
Update ->monitoring_mode once the monitoring has been started.
Add a debug message to log that we have started monitoring.

When we temporarily stop monitoring while running an event script,
restart monitoring after the event script wrapper returns instead of in 
the event script callback.

Let monitoring_mode start out as DISABLED and let it be enabled once we call ctdb_start_monitoring.

don't check for MONITORING_DISABLED in check_for_dead_nodes(). If 
monitoring is disabled, this event handler will not be called.

(This used to be ctdb commit 3a93ae8bdcffb1adbd6243844f3058fc742f76aa)
2007-11-30 08:44:34 +11:00
Ronnie Sahlberg
3c1f9882a8 revert 773
(This used to be ctdb commit 5a1c8f458ddc9b0ff532afda6007e32db10a71c8)
2007-11-12 10:23:35 +11:00
Ronnie Sahlberg
df5dd43e7c add a new tunable "CheckNodesFile" that when set to 0 will disable the
check in the recovery daemon that all nodes are using the same 
/etc/ctdb/nodes file.

Also add some more missing checks that the pnn used is a valid pnn 
before using it to dereference the ctdb->nodes array


This is useful since it allows us to add more physical nodes to an 
existing cluster without having to bring down the entire cluster.

The procedure to add an additional node to an existing cluster would then be:
1, on all nodes set CheckNodesFile=0 using 'ctdb setvar'
2, on all nodes add CTDB_SET_CheckNodesFile=0 to /etc/sysconfig/ctdb
For each node, one at a time :
3, use 'ctdb disable' to stop the hosted services
4, service ctdb stop
5, service ctdb start
Once all nodes have been restarted 
6, on all nodes remove CTDB_SET_CheckNodesFile=0 from 
/etc/sysconfig/ctdb
7, on all nodes set CheckNodesFile=0 using 'ctdb setvar'

8, configure and start up the new node

During this procedure, only one node at a time was brought 
down/restarted and was so only for a short period.

(This used to be ctdb commit 462501a32143e943ce350bd904a47c0955414a51)
2007-11-05 13:36:11 +11:00
Ronnie Sahlberg
056aac6e0c add a new tunable : DeterministicIPs that makes the allocation of
public addresses to nodes deterministic.

Activate it by adding CTDB_SET_DeterministicIPs=1 in /etc/sysconfig/ctdb

When this is set, the first entry in /etc/ctdb/public_addresses will 
always be hosted by node 0, when that node is available, the second 
entry by node 1 and so on.

This tunable allows the allocation of addresses to become very 
unbalanced and is only for debugging/testing use.
Beware, this feature requires that /etc/ctdb/public_addresses is 
identical on all the nodes in the cluster.

(This used to be ctdb commit f0ca221f235731542090d8a6c86f2b7cd2ce2f96)
2007-10-16 12:15:02 +10:00
Ronnie Sahlberg
bdd67bba1e add a --single-public-ip argument to ctdbd to specify the ip address
used in single public ip address mode.
when using this argument, --public-interface must also be used.

add a vnn structure to the ctdb context to describe the single public ip 
address


update the killtcp control in the daemon so that if a socketpair that is to 
be killed does not match a normal public address it checks if the 
destination address matches the single public ip address and if so uses 
that vnn structure from the ctdb context


this allows killtcp to kill also connections to the single public ip 
instead of only normal public addresses

(This used to be ctdb commit 5661ba17b91f62821dec1c76056c78b99752a90b)
2007-10-10 09:42:32 +10:00
Ronnie Sahlberg
7735957693 remove some debug outputs
(This used to be ctdb commit f29c0b52df1f455909ba133e3ad3bc462dc32929)
2007-10-09 13:45:42 +10:00
Ronnie Sahlberg
80cd82f8e4 add a control to send gratuitous arps from the ctdb daemon
(This used to be ctdb commit 563819dd1acb344f95aabb4bad990b36f7ea4520)
2007-10-09 11:56:09 +10:00
Andrew Tridgell
30de14fe79 force recovery if unable to tell a node to release an IP
(This used to be ctdb commit 6895788d2499344a03357e5c1103cb8383e9eaf7)
2007-09-13 11:19:49 +10:00
Andrew Tridgell
3c0f61cb92 we don't need the is_loopback logic in ctdb any more
(This used to be ctdb commit 4ecf29ade0099c7180932288191de9840c8d90a9)
2007-09-13 10:45:06 +10:00
Andrew Tridgell
67bd64ef35 - don't allow the registration of clients with IPs we don't hold
- change some debug levels to make tracking of IP release problems easier
(This used to be ctdb commit 5f9aed62adaf87750f953412c55b29c58e4bb6c0)
2007-09-12 13:22:31 +10:00
Andrew Tridgell
5b65a6c7f0 get interface right
(This used to be ctdb commit e0edc38d7e897f7de2850eb2cfd17fea75c16fcc)
2007-09-10 20:45:27 +10:00
Andrew Tridgell
f3ae1cdb02 - use struct sockaddr_in more consistently instead of string addresses
- allow for public_address lines with a defaulting interface

(This used to be ctdb commit 29cb760f76e639a0f2ce1d553645a9dc26ee09e5)
2007-09-10 14:27:29 +10:00
Andrew Tridgell
42168177ef merge from ronnie
(This used to be ctdb commit 1f21d4d563232926c35d03c4d69eb69190823dc6)
2007-09-10 13:21:11 +10:00
Ronnie Sahlberg
4ac749bfa4 change the signature to ctdb_sys_have_ip() to also return:
a bool that specifies whether the ip was held by a loopback adaptor or 
not
 the name of the interface where the ip was held

when we release an ip address from an interface, move the ip address 
over to the loopback interface

when we release an ip address, after we have moved it onto loopback, 
use 60.nfs to kill off the server side (the local part) of the tcp 
connection so that the tcp connections don't survive a 
failover/failback

61.nfstickle: since we kill the tcp connections when we release an ip 
address we no longer need to restart the nfs service in 61.nfstickle

update ctdb_takeover to use the new signature for ctdb_sys_have_ip

when we add a tcp connection to kill in ctdb_killtcp_add_connection()
check if either the source or destination address matches a known public 
address

(This used to be ctdb commit f9fd2a4719c50f6b8e01d0a1b3a74b76b52ecaf3)
2007-09-10 07:20:44 +10:00
Ronnie Sahlberg
e4eeceaf3a don't dereference vnn before we have assigned it a pointer value
(This used to be ctdb commit 2a8fc69aea8527b22a3fe57427677e4caff57338)
2007-09-05 14:29:44 +10:00
Ronnie Sahlberg
77ec4d5248 allow different nodes in the cluster to use different public_addresses
files
so that we can partition the cluster into different subsets of nodes 
which each serve a different subset of the public addresses

(This used to be ctdb commit 889e0fe69e4c88c6166282b12843b8d9727552d6)
2007-09-04 23:15:23 +10:00
Ronnie Sahlberg
8f819c6a0e get rid of the ctdb_vnn_list structure and just use a single list of
ctdb_vnn

(This used to be ctdb commit 7b9fd06321af17043136b1420b57284450ae7ba5)
2007-09-04 18:20:29 +10:00
Ronnie Sahlberg
cf45c5096c we can't have takeover_ctx hanging off ctdb since it is freed/recreated
every time we release an ip.
this context is used to hold all resources needed when sending out 
gratuitous arps and tcp tickles during ip takeover.

we hang it off the vnn structure that manages that particular ip address 
instead so that we can have multiple ones going in parallel

this bug (or the same bug in different shape) has probably been in ctdb 
for very very long   but is likely to be hard to trigger

(This used to be ctdb commit c58db1cadaba253b2659573673b28c235ef7db76)
2007-09-04 14:36:52 +10:00
Ronnie Sahlberg
3e6be59f61 fix typo in debug output
(This used to be ctdb commit 011a777c6e538ca79f104c7884a4f0e222997382)
2007-09-04 14:21:35 +10:00
Ronnie Sahlberg
784eac9079 don't just always return 0 from the killtcp control.
return 0 or -1 so that the ctdb tool knows whether the control succeeded 
or not

(This used to be ctdb commit cace8b40090be5529ec6b463d3839d0e22f4039d)
2007-09-04 14:19:18 +10:00
Ronnie Sahlberg
eb4cf6a686 change ctdb->vnn to ctdb->pnn
(This used to be ctdb commit 8c776e5707e503ec6586aae39ac6b3ea5a2fd2bc)
2007-09-04 10:06:36 +10:00
Ronnie Sahlberg
12ebb74838 change how we do public addresses and takeover so that we can have
multiple public addresses spread across multiple interfaces on each 
node.

this is a massive patch since we have previously made the assumption that 
we only have one public address per node.

get rid of the public_interface argument.  the public addresses file 
now explicitly lists which interface the address belongs to

(This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8)
2007-09-04 09:50:07 +10:00
Andrew Tridgell
7f630b67f6 fixed segv when no public interface is set
(This used to be ctdb commit 55b415f87bd3cba13c73ccd2fe661720754a6af7)
2007-08-27 11:49:42 +10:00
Ronnie Sahlberg
7e1f840c8d if a public address has already been taken over by a node, then let that
public address remain at that node until either the node becomes 
unhealthy or the original/primary node for that address becomes healthy 
again.


Otherwise what will happen is 
1, if we ban a node,   the banning code immediately does a 
takeover_run() and reassigns the public address to a different node in 
the cluster.
2, a few seconds later (at most) the recovery daemon will detect that 
the number of nodes has shrunk and will initiate a recovery.
During the recovery  the public address would again be assigned to a 
node, this time a different node.

(This used to be ctdb commit 30a6b7a648e22873d8ce6289a3d6dc42c4b9e3b3)
2007-08-20 14:16:58 +10:00
Ronnie Sahlberg
3b9d50f3ee change the now rather small /etc/ctdb/events script into a service
specific script /etc/ctdb/events.d/00.ctdb

get rid of CTDB_EVENTS_SCRIPT and --event-script

(This used to be ctdb commit 81ccfaf838e5772d4a58eb6a70224b7b39aba9f3)
2007-08-15 15:01:31 +10:00
Ronnie Sahlberg
4023576e50 call the service specific event scripts directly from the forked child
instead of from /etc/ctdb/events so that we can get better debugging 
output in the logs when something fails in the scripts

(This used to be ctdb commit 4ed96b768aea1611e8002f7095d3c4d12ccf77a3)
2007-08-15 14:44:03 +10:00
Ronnie Sahlberg
56d5ef27b6 add a wrapper function to create the key used to insert/lookup a certain
tcp connection in the tree that stores the tcp connections to kill by 
sending an RST

add a define that specifies the keylength instead of hardcoding it as 4

(This used to be ctdb commit 6a8322cbae10f2c78b2e286c75aeb25ece12ea7f)
2007-08-15 10:01:00 +10:00
Ronnie Sahlberg
adb49f02f0 change the mem hierarchy for trees. let the node be owned by the data
we store in the tree and use a node destructor so that when the data is 
talloc_free()d we also remove the node from the tree.

(This used to be ctdb commit b8dabd1811ebd85ee031563e95085f720a2fa04d)
2007-08-09 14:08:59 +10:00
Ronnie Sahlberg
9c216d0d76 when we want to kill a tcp connection we stored the connection
description (src + dst sockaddr_in) in a linked list.
every time we receive a captured packet from the network we had to walk 
this list in linear time to see if the packet matched a connection we 
wanted to RST.
which wouldn't scale very well.


replace the linked list with a redblack tree that is indexed by
src address, src port,  dst address,   dst port
to make checking whether the packet belongs to a connection we want to 
RST very fast and scalable


the reason we need to capture packets when we want to kill a TCP 
connection is because we must wait for an ACK coming back from the 
remote host  so that we can learn which sequence number to use in the 
RST.
Most tcp stacks today will ignore any and all RST segments unless the 
sequence number lies exactly on the right edge of the window to make 
spoofing RST a little bit more difficult.

(This used to be ctdb commit ced18caea8582af042287beb6333dd1f8ba3344d)
2007-08-08 15:09:19 +10:00
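
The lookup change described in 9c216d0d76 can be sketched as follows. This is only an illustration, not the ctdb code: the commit uses a red-black tree in C, while this sketch swaps in a Python dict (a hash table) to show the same idea of indexing connections by (src address, src port, dst address, dst port) rather than walking a list. The names `make_key` and `KillList` are hypothetical.

```python
import ipaddress
from typing import Dict, Tuple

# Key layout assumed for illustration: four integers,
# (src address, src port, dst address, dst port).
ConnKey = Tuple[int, int, int, int]


def make_key(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> ConnKey:
    """Build the 4-element lookup key for one IPv4 TCP connection."""
    return (
        int(ipaddress.IPv4Address(src_ip)),
        src_port,
        int(ipaddress.IPv4Address(dst_ip)),
        dst_port,
    )


class KillList:
    """Connections waiting to be reset, indexed by their 4-tuple.

    A dict stands in for the tree here; both avoid the linear list walk
    on every captured packet.
    """

    def __init__(self) -> None:
        self._conns: Dict[ConnKey, bool] = {}

    def add(self, src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> None:
        self._conns[make_key(src_ip, src_port, dst_ip, dst_port)] = True

    def matches(self, src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> bool:
        """Return True if a captured packet belongs to a connection we want to RST."""
        return make_key(src_ip, src_port, dst_ip, dst_port) in self._conns
```

Each captured packet then costs one keyed lookup instead of a scan over every pending connection.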
Ronnie Sahlberg
203306400e add helpers to traverse a tree where the key is an array of uint32
(This used to be ctdb commit d328c66827cafff6356e96df2a782930274fe139)
2007-08-08 13:50:18 +10:00
Ronnie Sahlberg
dd14afe6aa after we have checked dest address that it is a public address
update addr to the source address so the printout in the log matches
the client that attached to samba

(This used to be ctdb commit 72098b71c79469c86769ca82bbd484c81902d27c)
2007-07-30 16:10:14 +10:00
Ronnie Sahlberg
e666808f60 no need to have a separate assignment of the tcparray pointer followed
by a talloc_steal()
use the returned pointer in talloc_steal as the value to assign

(This used to be ctdb commit 5c6375ad3bbecfa725ec3b1477f259e5a8191866)
2007-07-25 08:03:58 +10:00
Ronnie Sahlberg
81294825e7 when we build the arp structure for sending gratuitous arp (and tcp
tickles) just talloc_steal the entire tcp_array into the arp 
structure instead of copying each of the entries into a linked list
and then releasing the tcparray.

(This used to be ctdb commit 468e237740cf37a65872ef700bbb1284ede8352a)
2007-07-24 07:46:51 +10:00
Ronnie Sahlberg
ea56d1d20e set the tcp tickle update flag to true once we have done a takeover and
tickled all connections
otherwise the other nodes will still remember this list until next time 
we have had a connection/client closing.

(This used to be ctdb commit cb8e5d4bbee2f14f498735489f673ff3679dfd9d)
2007-07-20 19:11:45 +10:00
Ronnie Sahlberg
81767b2a7b when a client connects with TCP_CLIENT we should look at the
destination address to find the public address   not the source address

(This used to be ctdb commit d6d4a7f38a52c1c2579a54d14cb7a6981fb42f5b)
2007-07-20 17:04:08 +10:00
Ronnie Sahlberg
fca90ce3c3 updated ctdb tickle management
there is an array for each node/public address that contains tcp tickles

we send a TCP_ADD as a broadcast to all nodes when a client is added

if tcp tickles are removed, they are only removed immediately from the 
local node.
once every 20 seconds a node will push/broadcast out the tickle list for 
all public addresses it manages.   this will remove any deleted tickles 
from the remote nodes

(This used to be ctdb commit e3c432a915222e1392d91835bc7a73a96ab61ac9)
2007-07-20 15:05:55 +10:00
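
The tickle bookkeeping described in fca90ce3c3 can be sketched roughly as below. This is a simplified model under stated assumptions, not the daemon's implementation: nodes are plain in-process objects, "broadcast" is a loop over peers, and the class and method names (`Node`, `add_client`, `remove_client`, `push_tickle_lists`) are hypothetical. The 20-second timer from the commit message is left to the caller.

```python
from typing import Dict, List, Set, Tuple

Conn = Tuple[str, int]  # client (ip, port); an assumed representation


class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        self.peers: List["Node"] = []
        self.owned: Set[str] = set()             # public addresses this node manages
        self.tickles: Dict[str, Set[Conn]] = {}  # per-public-address tickle list

    def add_client(self, public_ip: str, conn: Conn) -> None:
        """TCP_ADD: record locally and broadcast to all other nodes."""
        self.tickles.setdefault(public_ip, set()).add(conn)
        for peer in self.peers:
            peer.tickles.setdefault(public_ip, set()).add(conn)

    def remove_client(self, public_ip: str, conn: Conn) -> None:
        """Removal only takes effect on the local node straight away."""
        self.tickles.get(public_ip, set()).discard(conn)

    def push_tickle_lists(self) -> None:
        """Periodic push: replace each peer's list for every address we manage.

        Overwriting the remote copy is what drops deleted tickles there.
        """
        for ip in self.owned:
            snapshot = set(self.tickles.get(ip, set()))
            for peer in self.peers:
                peer.tickles[ip] = set(snapshot)
```

In this model a removed tickle lingers on remote nodes until the next `push_tickle_lists()` call, which mirrors the behaviour the commit message describes.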
Ronnie Sahlberg
7b17afdfcd change the tickle list from one global list into an array per public
ip/node

once we have started sending all tickles for a specific ip   delete the 
entire array so that the tickles don't remain forever in the ctdb 
server

add a control to send the full list of every tickle that is registered 
for a particular public ip/node

(This used to be ctdb commit d0eee33e44d3f8e26debbec21d41e2cbdbb520e6)
2007-07-20 10:06:41 +10:00
Andrew Tridgell
394190d3cc - log registering of tcp clients
- don't remove a tcp entry if we do not own the ip
(This used to be ctdb commit 400aa284b9785ce6409e7600df429f5849e3867d)
2007-07-19 15:04:54 +10:00
Andrew Tridgell
fb22d3bd2c merged from ronnie
(This used to be ctdb commit 765b07fa5d1af07c8c7212d19d8e9574060b3039)
2007-07-18 20:13:57 +10:00
Ronnie Sahlberg
4d1f3acc94 add a check if start_node is beyond the end of the nodemap and reset it
back to 0 if it is to prevent an infinite loop.

this could happen if in the future we add a mechanism to add/remove 
nodes to a cluster at runtime

(This used to be ctdb commit 217e80a468713fec86ccb0608460e3401046bb98)
2007-07-16 08:36:09 +10:00
Ronnie Sahlberg
49f98e79fd change the way we pick/find a new node to takeover for a failed node
to keep a static that controls at which node to start searching the 
list for takeover candidates next time we need to find a node.

each time we find a node to takeover, reset the start variable to point 
to the next node in the list

this makes the distribution of takeover nodes much more even

(This used to be ctdb commit e9800df5a21079ea478d16f7dd2fd4707de85650)
2007-07-16 08:28:44 +10:00
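
The selection scheme described in 49f98e79fd, together with the out-of-range guard from 4d1f3acc94, can be sketched as below. This is an illustration only: the real code is C and keeps a static variable, whereas this sketch holds the start index in a hypothetical `TakeoverPicker` object, and the `healthy` list is an assumed stand-in for the nodemap.

```python
from typing import List, Optional


class TakeoverPicker:
    """Round-robin choice of a takeover node, remembering where to start next."""

    def __init__(self) -> None:
        self.start = 0  # plays the role of the static start index

    def pick(self, healthy: List[bool], failed: int) -> Optional[int]:
        n = len(healthy)
        if n == 0:
            return None
        # Guard from 4d1f3acc94: if the saved index is past the end of the
        # node list, reset it to 0 instead of searching out of range.
        if self.start >= n:
            self.start = 0
        for offset in range(n):
            candidate = (self.start + offset) % n
            if candidate != failed and healthy[candidate]:
                # Next search begins at the node after the one just chosen.
                self.start = (candidate + 1) % n
                return candidate
        return None  # no healthy node available
```

Because each successful pick advances the start index, repeated takeovers rotate through the healthy nodes instead of always landing on the first one.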
Ronnie Sahlberg
f09566a81a add a private_data field to the killtcp structure and let the system
specific routines populate it as they see fit when creating a 
capture socket.
pass this structure to read_tcp and close capture socket as parameter

(This used to be ctdb commit 79bbfcfb2223889126fe307d5bbfd24917da07ee)
2007-07-13 17:07:10 +10:00
Andrew Tridgell
8f637e6317 ensure killtcp structure is initialised
(This used to be ctdb commit 2fe7d1ce87e55e125411e7406a9e00b8f55e3cb7)
2007-07-13 11:55:58 +10:00
Andrew Tridgell
1e14ecd176 - merge from ronnie
- cleaner handling of system capture socket

(This used to be ctdb commit d194a41a71b8466d0726dcbae3970a86386fcb3c)
2007-07-13 11:31:18 +10:00
Ronnie Sahlberg
a650497680 as an optimization for when we want to send multiple tickles at a time
let the caller create the sending socket and use a single socket instead 
of one new one for each tickle.
pass a sending socket to ctdb_sys_send_tcp()

ctdb_sys_kill_tcp is no longer used so remove it

set the socketflags for close on exec and nonblocking in the helper that 
creates the sockets instead of in the caller

add a helper to create a sending socket to send tickles from

(This used to be ctdb commit 469f3fb238a0674a2b48fdf1a7e657e32428178a)
2007-07-12 09:22:06 +10:00
Ronnie Sahlberg
823b7d4a5f rename killtcp->fd to killtcp->capture_fd
we might want to have two sockets attached to the killtcp structure
one for capturing and a second one for sending so we don't have to 
create a new socket for each tickle we want to send

(This used to be ctdb commit b3e82ec38047bbec1edfd88ade264077d4cbd2ee)
2007-07-12 08:52:24 +10:00
Ronnie Sahlberg
76ab80104a make the ctdb tool use the killtcp control in the daemon instead of
calling killtcp directly

(This used to be ctdb commit d21e3e9cf11bdcba6234302e033d6549c557dd69)
2007-07-12 08:30:04 +10:00
Ronnie Sahlberg
1ed0c3a9f7 add daemon code for the new kill_tcp control
(This used to be ctdb commit 8fe4ae62255ecb2db36bea736ff17409ba6614c5)
2007-07-11 18:24:25 +10:00
Ronnie Sahlberg
e4db03f7e6 add a ctdb_ prefix to two public functions
(This used to be ctdb commit 32adee5426aa75ddcd4d648ef326ed03d5ff5c46)
2007-07-11 18:13:03 +10:00
Ronnie Sahlberg
aa080f66d9 first cut at a better and more scalable socketkiller
that can kill multiple connections asynchronously using one listening 
socket

(This used to be ctdb commit 22bb44f3d745aa354becd75d30774992f6c40b3a)
2007-07-11 17:43:51 +10:00
Ronnie Sahlberg
0c44e0ad46 add a ctdb_kill_tcp_callback() that will perform a kill tcp using a
background process

(This used to be ctdb commit dcfcaacff56347d94c244512eb72219b05ef9c3d)
2007-07-11 12:33:14 +10:00
Andrew Tridgell
32de198fd3 update lib/replace from samba4
(This used to be ctdb commit f0555484105668c01c21f56322992e752e831109)
2007-07-10 15:29:31 +10:00
Andrew Tridgell
f1db15ffe1 fixed sense of inet_aton test
(This used to be ctdb commit ed5cf9b43c49312d3736e85077863d23990acce8)
2007-07-08 21:09:09 +10:00
Andrew Tridgell
056d3c35a4 call kill_clients when releasing all IPs, as well as for individual IPs
(This used to be ctdb commit ad68904720eb69757601589b06726190321685ac)
2007-07-08 20:45:12 +10:00
Andrew Tridgell
af5ee9981e we do tell banned nodes to release IPs
(This used to be ctdb commit 381dc0421d4d825398c03dcff4e79e3f76c3c981)
2007-07-08 20:24:03 +10:00
Andrew Tridgell
bdf01ed7c0 - neaten up the command line for killtcp
- split out the event script code into a separate module
- get rid of the separate takeover directory

(This used to be ctdb commit 8ea2c923a3e2464200ff79bf2c3f1f89e6a93ad4)
2007-07-04 16:51:13 +10:00