The same structure is required in new controls for database transactions.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
In this case: ctdbd_wrapper, onnode, ctdb_diagnostics, ctdb.sudoers.
Set sensible defaults from configure options.
Update documentation to match, trying to fix up anything that has been
missed before.
The onnode unit tests need a symlink to the functions file.
The simple integration tests need to set CTDB_BASE and also
need symlinks to functions/nodes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
fixup
Signed-off-by: Martin Schwenke <martin@meltin.net>
This hasn't existed for a long time.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
Just use ctdb_tcp_connection. It is the same. There are no external
users.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Volker Lendecke <vl@samba.org>
The timed out error is ignored for certain events (start_recovery,
recoverd, takeip, releaseip). If these events time out, then the debug
hung script outputs the following:
    3 scripts were executed last releaseip cycle
    00.ctdb              Status:OK    Duration:4.381 Thu Jul 16 23:45:24 2015
    01.reclock           Status:OK    Duration:13.422 Thu Jul 16 23:45:28 2015
    10.external          Status:DISABLED
    10.interface         Status:OK    Duration:-1437083142.208 Thu Jul 16 23:45:42 2015
The end time for timed-out scripts is not set. Since the status is not
returned as -ETIME for some events, "ctdb scriptstatus" prints a negative
duration.
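
The arithmetic behind the negative value can be sketched as follows. This
is an illustration only: the struct and function names are hypothetical and
are not taken from the CTDB source. It assumes the duration is computed as
"finished - start" and that an unset end time is all zeroes, which would
make the printed magnitude equal to the start time in seconds since the
epoch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical per-script record; field names are illustrative only. */
struct script_times {
	struct timeval start;
	struct timeval finished; /* assumed to stay {0, 0} on timeout */
};

/* True if an end time was recorded for the script. */
static bool script_finished(const struct script_times *s)
{
	assert(s != NULL);
	return s->finished.tv_sec != 0 || s->finished.tv_usec != 0;
}

/* Duration in seconds, computed naively as finished - start. */
static double script_duration(const struct script_times *s)
{
	assert(s != NULL);
	return (double)(s->finished.tv_sec - s->start.tv_sec) +
	       (double)(s->finished.tv_usec - s->start.tv_usec) / 1e6;
}
```

With a start time of 1437083142.208 and no end time, script_duration()
yields -1437083142.208, which has the same shape as the 10.interface line
above. A caller can test script_finished() before printing a duration.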
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
This follows the same pattern as the tstore command, and it allows
specifying key strings with a trailing \0 character.
Signed-off-by: Christof Schmitt <cs@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
Autobuild-User(master): Jeremy Allison <jra@samba.org>
Autobuild-Date(master): Mon Jul 6 23:23:22 CEST 2015 on sn-devel-104
Memory allocated by ctdb_sys_find_ifname is not
freed by the caller.
Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Michael Adam <obnox@samba.org>
A recovery is not required: when deleting a node it should already be
disconnected and when adding a node it will also be disconnected. The
new sanity checks in "reloadnodes" ensure that these assumptions are
met.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
If a recovery occurs when some nodes have reloaded and others haven't,
then the nodemaps will be inconsistent, so bad things will happen.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The code was too "clever". The 4 different cases should be separate.
The "node remains deleted" case doesn't need the IP address comparison
(always 0.0.0.0) or the disconnected check.
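
One plausible reading of the four cases is sketched below. The case names,
types and fields are assumptions made for illustration; only the "node
remains deleted" case and its lack of address and disconnected checks come
from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view of one entry in the current nodemap. */
struct map_entry {
	const char *addr;
	bool deleted;
	bool disconnected;
};

/* Hypothetical view of the matching line in the nodes file. */
struct file_entry {
	const char *addr;
	bool deleted; /* line is commented out */
};

enum reload_case {
	NODE_REMAINS_DELETED,
	NODE_DELETING,
	NODE_ADDING,
	NODE_UNCHANGED,
	NODE_INVALID
};

/* Handle each case separately instead of one combined condition. */
static enum reload_case classify(const struct map_entry *cur,
				 const struct file_entry *want)
{
	assert(cur != NULL && want != NULL);

	if (cur->deleted && want->deleted) {
		/* No address comparison, no disconnected check. */
		return NODE_REMAINS_DELETED;
	}
	if (!cur->deleted && want->deleted) {
		/* A node being deleted must already be disconnected. */
		return cur->disconnected ? NODE_DELETING : NODE_INVALID;
	}
	if (cur->deleted && !want->deleted) {
		return NODE_ADDING;
	}
	/* Present in both: the address must not have changed. */
	return strcmp(cur->addr, want->addr) == 0 ? NODE_UNCHANGED
						  : NODE_INVALID;
}
```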
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
There is no reason to serialise these or even handle remote nodes
first. Using a broadcast is more efficient and is less code.
Update expected test results to reflect changed order of messages.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Mar 23 15:04:00 CET 2015 on sn-devel-104
"ctdb reloadnodes" currently does no sanity checking of the nodes
file. This can cause chaos if a line is deleted from the nodes file
rather than commented out. It also repeatedly produces a spurious
warning for each deleted node, even if the node was deleted a long
time ago.
Instead compare the nodemap with the contents of the local nodes file
to sanity check before attempting any reloads. Note that this is
still imperfect if the nodes files are inconsistent across nodes but
it is better. Also ensure that any nodes that are to be deleted are
already disconnected. Avoid trying to talk to deleted nodes.
The current implementation is a bit unfortunate when it comes to
deleting nodes. The most obvious alternative to the above complexity
would be to reloadnodes on the specified node first, then fetch the
node map (in which newly deleted nodes would be marked as such) and
then handle the remote nodes. However, the implementation of
reloadnodes is asynchronous and it only actions the reload after 1
second. This is presumably to avoid the recovery master noticing the
inconsistency between nodemaps and triggering a recovery before all
nodes have had their nodemaps updated.
Note that this recovery can still occur if the check is done at an
inconvenient time. A better long term approach might be to quiesce
the recovery master checks while reloadnodes is in progress.
Update a unit test to reflect the change.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This compares the nodes file on the current node with that on all
nodes. If any are different then do not reload nodes.
If any nodes files can't be fetched then do not reload nodes. This
could be because some nodes are running an older version without this
feature. This is unsupported: why make a major cluster
reconfiguration while a cluster is half upgraded?
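
The rule can be sketched as below. This is a minimal illustration, not the
code in the ctdb tool: the types and function names are hypothetical, and a
NULL entry stands in for a nodes file that could not be fetched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of a nodes file: one string per line. */
struct nodes_file {
	const char *const *lines;
	size_t num;
};

/* Two nodes files match if they have the same lines in the same order. */
static bool nodes_file_equal(const struct nodes_file *a,
			     const struct nodes_file *b)
{
	size_t i;

	if (a->num != b->num) {
		return false;
	}
	for (i = 0; i < a->num; i++) {
		if (strcmp(a->lines[i], b->lines[i]) != 0) {
			return false;
		}
	}
	return true;
}

/*
 * Reload only if every remote nodes file was fetched and matches the
 * local one. remote[i] == NULL means the fetch from node i failed.
 */
static bool safe_to_reload(const struct nodes_file *local,
			   const struct nodes_file *const *remote,
			   size_t num_remote)
{
	size_t i;

	assert(local != NULL);
	for (i = 0; i < num_remote; i++) {
		if (remote[i] == NULL) {
			return false;
		}
		if (!nodes_file_equal(local, remote[i])) {
			return false;
		}
	}
	return true;
}
```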
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
It should not be possible to specify "-n <othernode>", unless
<othernode> is the current node. To support this, add a new function
assert_current_node_only().
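
A minimal sketch of such a check follows. The real function's signature is
not shown here, so the parameters, the PNN_UNSPECIFIED sentinel and the
error message are all assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sentinel meaning "-n was not given on the command line". */
#define PNN_UNSPECIFIED UINT32_MAX

/* True if the command targets the current node, explicitly or by default. */
static bool is_current_node_only(uint32_t requested_pnn, uint32_t current_pnn)
{
	/* The current node must already be resolved to a real PNN. */
	assert(current_pnn != PNN_UNSPECIFIED);
	return requested_pnn == PNN_UNSPECIFIED ||
	       requested_pnn == current_pnn;
}

/* Exit with an error if another node was requested. */
static void assert_current_node_only_sketch(uint32_t requested_pnn,
					    uint32_t current_pnn)
{
	if (!is_current_node_only(requested_pnn, current_pnn)) {
		fprintf(stderr,
			"This command can only be run against the current node\n");
		exit(1);
	}
}
```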
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
In the CTDB CLI tool source code and the documentation example.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
To support this, update printm() to replace ':' in format string with
options.machineseparator, which is a string but must contain a single
character.
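
The substitution could look roughly like the sketch below. It is not the
actual printm() code: it writes into a caller-supplied buffer so that the
result can be inspected, and the function name, fixed-size format buffer
and explicit separator parameter are assumptions. Only ':' characters in
the format string are replaced, so a ':' inside an argument is left alone.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Format into out[], replacing every ':' in the format string with sep.
 * Returns what vsnprintf(3) returns, or -1 if the format is too long.
 * sep must not be '%', or the rewritten format would change meaning.
 */
static int printm_to(char *out, size_t outlen, char sep,
		     const char *format, ...)
{
	char fmt[256];
	size_t i, n;
	va_list ap;
	int ret;

	assert(out != NULL && format != NULL && sep != '%');

	n = strlen(format);
	if (n >= sizeof(fmt)) {
		return -1;
	}
	/* Copy the format including its terminating NUL. */
	for (i = 0; i <= n; i++) {
		fmt[i] = (format[i] == ':') ? sep : format[i];
	}

	va_start(ap, format);
	ret = vsnprintf(out, outlen, fmt, ap);
	va_end(ap);
	return ret;
}
```

For example, calling printm_to() with sep '|' and the format ":%d:%s:\n"
produces fields delimited by '|' instead of ':'.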
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
printm() is a printf(3) replacement and must be used for printing any
machine-readable output. It currently just calls vprintf(3). Later
it will change the field delimiter.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Put declarations into ctdb_logging.h, factor out some common code,
clean up #includes.
Remove the check to see if the 1st character of the debug level is
'-'. This is wrong, since it is trying to check for a negative
numeric debug level (which is no longer supported) and would need to
be handled in the else anyway.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Found by address sanitizer.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri Oct 17 12:56:02 CEST 2014 on sn-devel-104
As far as we know, nobody uses this and it just complicates the
logging subsystem.
Remove all ringbuffer code and documentation. Update the local
daemons startup code correspondingly.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Volker Lendecke <vl@samba.org>
This makes it consistent with Samba, to ease transition.
Update unit test code to link with tdb_wrap instead of including
db_wrap.c.
There are some potential whitespace fixes in this commit that have
been ignored. CTDB's lib/tdb_wrap will be deleted after the
transition to Samba's lib/tdb_wrap, so there's no point polishing it
too much.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This function is only used in this file. Samba's lib/util doesn't
have timeval_delta(), so this stages a clean transition.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is part of a migration to Samba's lib/util. CTDB always passes 0
(i.e. no max_size) so use a simple assert() to enforce this, rather
than changing a lot of code that will be discarded anyway.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This effectively reverts commit 442953c540424ad0c64f4264b5ee27c45a3130e8.
The correct way of telling recovery daemon to trigger a database recovery is
by setting recovery mode to active. There is no need to freeze databases as
recovery master will do that across the cluster anyway.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Recent changes have caused these commands to attempt to get
capabilities from all nodes before doing further filtering. This
means that capabilities are unnecessarily fetched from nodes that are
unlikely to be the master. If such a node does not answer the control
then many nodes can fail to calculate the master node. In the case of
natgwlist this will cause "monitor" events to fail resulting in
unhealthy nodes.
Restore the behaviour where capabilities are only fetched for a node
that will be the master if it has the desired flags.
Although this masks a problem where a connected node is not replying,
it can help to avoid an outage in some cases.
Add supporting tests and infrastructure. The infrastructure lets a
timeout be faked, so far only for ctdb_ctrl_getcapabilities_stub().
The first test checks that this infrastructure works when the first node
times out in natgwlist. The second test checks the case worked around by
the above fix: no failure when a node with a PNN beyond the NATGW master
times out.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu May 29 05:59:37 CEST 2014 on sn-devel-104
script_status->num_scripts is used as the count in this message:
"%d scripts were executed last %s cycle\n"
However, script_status->num_scripts includes disabled scripts, which
are never actually executed.
Instead, count the number of scripts that aren't disabled and make the
message print that.
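
The counting can be sketched as below. The enum, struct and function names
are hypothetical stand-ins, not the CTDB data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical script states; only DISABLED matters for the count. */
enum script_state { SCRIPT_OK, SCRIPT_ERROR, SCRIPT_DISABLED };

struct script {
	const char *name;
	enum script_state state;
};

/* Count the scripts that were actually executed, i.e. not disabled. */
static unsigned int count_executed(const struct script *scripts, size_t num)
{
	unsigned int count = 0;
	size_t i;

	assert(scripts != NULL || num == 0);
	for (i = 0; i < num; i++) {
		if (scripts[i].state != SCRIPT_DISABLED) {
			count++;
		}
	}
	return count;
}
```

For a list with three enabled scripts and one disabled script, this
returns 3, although num_scripts would be 4.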
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed May 28 02:27:48 CEST 2014 on sn-devel-104
Freeing the ctdb_db context now closes the tdb database. So make sure
all the locks are released (by freeing record handles or the memory context
from which record handles are allocated) before freeing the ctdb_db context.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>