IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Memory allocated by ctdb_sys_find_ifname is not
freed by the caller.
Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Michael Adam <obnox@samba.org>
A recovery is not required: when deleting a node it should already be
disconnected and when adding a node it will also be disconnected. The
new sanity checks in "reloadnodes" ensure that these assumptions are
met.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
If a recovery occurs when some nodes have reloaded and others haven't
then the nodemaps with be inconsistent so bad things will happen.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The code was too "clever". The 4 different cases should be separate.
The "node remains deleted" case doesn't need the IP address comparison
(always 0.0.0.0) or the disconnected check.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
There is no reason to serialise these or even handle remote nodes
first. Using a broadcast is more efficient and is less code.
Update expected test results to reflect changed order of messages.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Mar 23 15:04:00 CET 2015 on sn-devel-104
"ctdb reloadnodes" currently does no sanity checking of the nodes
file. This can cause chaos if a line is deleted from the nodes file
rather than commented out. It also repeatedly produces a spurious
warning for each deleted node, even if the node was deleted a long
time ago.
Instead compare the nodemap with the contents of the local nodes file
to sanity check before attempting any reloads. Note that this is
still imperfect if the nodes files are inconsistent across nodes but
it is better. Also ensure that any nodes that are to be deleted are
already disconnected. Avoid trying to talk to deleted nodes.
The current implementation is a bit unfortunate when it comes to
deleting nodes. The most obvious alternative to the above complexity
would be to reloadnodes on the specified node first, then fetch the
node map (in which newly deleted nodes would be marked as such) and
then handle the remote nodes. However, the implementation of
reloadnodes is asynchronous and it only actions the reload after 1
second. This is presumably to avoid the recovery master noticing the
inconsistency between nodemaps and triggering a recovery before all
nodes have had their nodemaps updated.
Note that this recovery can still occur if the check is done at an
inconvenient time. A better long term approach might be to quiesce
the recovery master checks while reloadnodes is in progress.
Update a unit test to reflect the change.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This compares the nodes file on the current node with that on all
nodes. If any are different then do not reload nodes.
If any nodes files can't be fetched then do not reload nodes. This
could be because some nodes are running an older version without this
feature. This is unsupported: why make a major cluster
reconfiguration while a cluster is half upgraded?
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
It should not be possible to specify "-n <othernode>", unless
<othernode> is the current node. To support this, add new function
assert_current_node_only().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
In the CTDB CLI tool source code and the documentation example.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
To support this, update printm() to replace ':' in format string with
options.machineseparator, which is a string but must contain a single
character.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
printm() is a printf(3) replacement and must be used to printing any
machine readable output. It currently just calls vprintf(3). Later
it will change the field delimiter.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Put declarations into ctdb_logging.h, factor out some common code,
clean up #includes.
Remove the check so see if the 1st character of the debug level is
'-'. This is wrong, since it is trying to check for a negative
numeric debug level (which is no longer supported) and would need to
be handled in the else anyway.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Found by address sanitizer.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri Oct 17 12:56:02 CEST 2014 on sn-devel-104
As far as we know, nobody uses this and it just complicates the
logging subsystem.
Remove all ringbuffer code and documentation. Update the local
daemons startup code correspondingly.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Volker Lendecke <vl@samba.org>
This makes it consistent with Samba, to ease transition.
Update unit test code to link to with tdb_wrap instead of including
db_wrap.c.
There are some potential whitespace fixes in this commit that have
been ignored. CTDB's lib/tdb_wrap will be deleted after the
transition to Samba's lib/tdb_wrap, so there's no point polishing it
too much.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This function is only used in this file. Samba's lib/util doesn't
have timeval_delta(), so staging a clean transition.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is part of a migration to Samba's lib/util. CTDB always passes 0
(i.e. no max_size) so use a simple assert() to enforce this, rather
than changing a lot of code that will be discarded anyway.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This effectively reverts commit 442953c540424ad0c64f4264b5ee27c45a3130e8.
The correct way of telling recovery daemon to trigger a database recovery is
by setting recovery mode to active. There is no need to freeze databases as
recovery master will do that across the cluster anyway.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Recent changes have caused these commands to attempt to get
capabilities from all nodes before doing further filtering. This
means that capabilities are unnecessarily fetched from nodes that are
unlikely to be the master. If such a node does not answer the control
then many nodes can fail to calculate the master node. In the case of
natgwlist this will cause "monitor" events to fail resulting in
unhealthy nodes.
Restore the behaviour where capabilities are only fetched for a node
that will be the master if it has the desired flags.
Although this masks a problem where a connected node is not replying,
it can help to avoid an outage in some cases.
Add supporting tests and infrastructure. Infrastructure just lets a
timeout be faked - just for ctdb_ctrl_getcapabilities_stub() so far.
First test checks that this infrastructure works if the first node
times out in natgwlist. Second test checks the case worked around by
the above fix - that is, no failure when a node with PNN beyond the
NATGW master can time out.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu May 29 05:59:37 CEST 2014 on sn-devel-104
script_status->num_scripts is used as the count in this message:
"%d scripts were executed last %s cycle\n"
However, script_status->num_scripts includes disabled scripts, which
are never actually executed.
Instead, count the number of scripts that aren't disabled and make the
message print that.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed May 28 02:27:48 CEST 2014 on sn-devel-104
Now freeing ctdb_db context will close the tdb database. So make sure
all the locks are released (by freeing record handles or memory context
from which record handles are allocated) before freeing ctdb_db context.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
This makes sure that AllowClientDBAttach is set to 0 before detaching any
databases.
If someone enables the tunable between checking of tunable and actual
detaching of databases, then they deserve what they get. :-)
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
Commit ba69742ccd822562ca2135d2466e09bf1216644b missed the point of
filtering disconnected nodes while limiting the nodemap to those in
the NAT gateway group. It was really to avoid trying to fetch
capabilities from disconnected nodes. This should be explicitly done
in filter_nodemap_by_capabilities(), otherwise "ctdb natgwlist" simply
fails when there is a disconnected node.
Note that the alternate solution where filter_nodemap_by_flags() is
called before filter_nodemap_by_capabilities() would not be not
correct. Filtering on flags first can produce a "healthier" set of
nodes where none of them have the NAT gateway capability.
Also extend stub for ctdb_ctrl_getcapabilities() to fail when trying
to get capabilities from a disconnected node and add a corresponding
test to confirm that "ctdb natgwlist" is no longer broken.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This way this logic is centralised. It also means that the IP address
comparisons in the NAT gateway code are IPv6 safe.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
There is a check for empty lines in the loop just below.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Factor it from read_nodes_file(). Use it there and in
read_natgw_nodes_file().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Add another filter function, like the ones for capabilities and flags
to, for filtering by NAT gateway nodes. This makes the main
natgw_list function more readable.
Note that this drops the early filtering of disconnected nodes, so
they will now be listed in a NAT gateway group. This makes sense.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The index of the nodes array in nodemap isn't the PNN.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Instead of just finding the first node that doesn't have any flags in
flag_mask set, change it into a function that filters a nodemap to
exclude nodes with the given flags.
This makes the NATGW code simpler but also provides a function that
can be used in other code.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Check capabilities once to build a filtered node list instead of
repeatedly checking capabilities
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>