IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=12157
CID 1364703: Resource leak (RESOURCE_LEAK)
However, this would already be fixed by the fix for CID 1125618, so
this is probably just a minor bug fix.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
BUG: https://bugzilla.samba.org/show_bug.cgi?id=12151
This was already dropped in commit d678684695.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Wed Aug 17 09:22:13 CEST 2016 on sn-devel-144
BUG: https://bugzilla.samba.org/show_bug.cgi?id=12152
This makes the behaviour of "ctdb addip" similar to "ctdb delip".
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
Autobuild-User(master): Jeremy Allison <jra@samba.org>
Autobuild-Date(master): Sat Aug 13 00:55:02 CEST 2016 on sn-devel-144
BUG: https://bugzilla.samba.org/show_bug.cgi?id=12121
If the connection list is empty then process_clist_send() still
creates a request. However, since no subrequests are created for
controls sent, tevent_req_poll() waits forever for an event.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
New tool breaks some of the tool unit tests due to improved error
messages. Those changes are in the following patch.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Removing and adding replacement code makes the commits cleaner.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Print the sequence number without preamble. Print it in hex to match
the logs.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
No preamble, just print keyword ENABLED or DISABLED. Fix the
documentation to reflect this and remove the text that is simply
wrong.
Also remove output from "ctdb enablemonitor" and "ctdb disablemonitor"
on success. This is just noise.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Print just the debug level as a description, for both human and
machine readable.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This message was useful when debugging during development but it isn't
generally useful.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Databases should never be thawed manually. A database recovery will
correctly thaw all databases. Otherwise there is a bug in the database
recovery.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
SC2006: Use $(..) instead of legacy `..`.
Make the obvious changes to $(...) but convert some to $((...)).
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC2124: Assigning an array to a string!
Assign as array, or use * instead of @ to concatenate.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC2059: Don't use variables in the printf format string.
Use printf "..%s.." "$foo".
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC1004: You don't break lines with \ in single quotes, it results in
literal backslash-linefeed.
These don't hurt, since awk can cope with the continuations. However,
they don't add anything.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC2046: Quote this to prevent word splitting.
SC2086: Double quote to prevent globbing and word splitting.
Add some quoting where it makes sense. Use shellcheck directives for
false-positives.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC2034: VAR appears unused. Verify it or export it.
Drop some variables that are unnecessarily used. Use shellcheck
directive for false-positives.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
==27786== Syscall param write(buf) points to uninitialised byte(s)
==27786== at 0x62820D0: __write_nocancel (syscall-template.S:84)
==27786== by 0x428B57: ctdb_queue_send (ctdb_io.c:322)
==27786== by 0x41F3B1: ctdb_client_queue_pkt (ctdb_client.c:153)
==27786== by 0x41F3B1: ctdb_client_send_message (ctdb_client.c:603)
==27786== by 0x419FA3: srvid_broadcast.constprop.26 (ctdb.c:1965)
==27786== by 0x41B869: control_reload_nodes_file (ctdb.c:5696)
==27786== by 0x404DBA: main (ctdb.c:6008)
==27786== Address 0x7ead310 is 144 bytes inside a block of size 168 alloc'd
==27786== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27786== by 0x564DBEC: __talloc_with_prefix (talloc.c:675)
==27786== by 0x564DBEC: __talloc (talloc.c:716)
==27786== by 0x564DBEC: _talloc_named_const (talloc.c:873)
==27786== by 0x564DBEC: _talloc_zero (talloc.c:2318)
==27786== by 0x41E1E2: _ctdbd_allocate_pkt (ctdb_client.c:59)
==27786== by 0x41F37D: ctdb_client_send_message (ctdb_client.c:594)
==27786== by 0x419FA3: srvid_broadcast.constprop.26 (ctdb.c:1965)
==27786== by 0x41B869: control_reload_nodes_file (ctdb.c:5696)
==27786== by 0x404DBA: main (ctdb.c:6008)
==27786==
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
The recovery lock can not be reliably updated at run-time. If it
fails to update on some nodes then split-brain protection is gone and
there is no reasonable way to repair the situation. CTDB will have to
be restarted on all nodes. So, if this feature is being used to avoid
scheduling an outage then an outage will have to be scheduled just in
case!
To update the recovery lock, shut down CTDB on all nodes, reconfigure
the recovery lock and start CTDB again.
Those that *really* want to be able to change the recovery lock at
run-time can still do so. Set CTDB_RECOVERY_LOCK to point to a script
and this script can then be modified at run-time. However, please
don't report bugs if bad things happen...
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Christof Schmitt <cs@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
Autobuild-User(master): Jeremy Allison <jra@samba.org>
Autobuild-Date(master): Sat Apr 30 04:28:13 CEST 2016 on sn-devel-144
If the reclock is set then print it, otherwise print nothing.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
LVS and NAT gateway support had bit-rotted. We don't use any of these
in scripts/tests and we very much doubt anyone else uses them.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Apr 25 10:34:47 CEST 2016 on sn-devel-144
This can list the different aspects of status: master, list, status.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Tweak "ctdb natgw natgwlist" to keep output format the same.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Why allocate all that memory and transfer all that data across the
socket?
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
For "master", if there is a master then print the PNN, otherwise print
nothing.
For "list", print the PNN and IP addresses without a colon in between.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This simply calls out to the wrapper, so that commands are changed as
follows:
ctdb lvsmaster -> ctdb lvs master
ctdb lvs -> ctdb lvs list
This provides a simple, extensible interface and means that "ctdb lvs
status" is also available.
Unit tests are streamlined so that there is a single test for each
CTDB state. Each test does "master", "list" and "status" sub-tests.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This will replace the ctdb CLI tool "lvs" and "lvsmaster" options. It
also makes LVS daemon support unnecessary.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Testing indicates that these are good reliable defaults that can kill
many connections in a reasonable amount of time.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Apr 1 08:10:54 CEST 2016 on sn-devel-144
This made sense when connections were individually queued in the
daemon. However, they're now done in batch so just keep an overall
count.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
When previously killing TCP connections via the daemon there was some
latency due to each kill being sent to the daemon via a separate
control. This probably meant that when doing a 2-way kill the tickle
ACKs sent to the client end of a connection would not interfere with
listening for the reply ACK from the server end. Now that there is no
latency, the tickle ACK or RST sent to the client end can be seen as
the reply to the server end tickle ACK, and vice-versa.
To avoid this, throw away packets that look like we sent them.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The end of the connection in parentheses is not the end being killed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Since they're being done in batch, just schedule an event to traverse
all the connections.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The handler won't be called unless there is something to read.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This will allow killing of TCP connections without daemon involvement.
It looks strange that the common code for daemon and helper is in the
server directory. Having it in the server directory means less
temporary changes to the build configuration. This code will move
into the helper itself and will no longer be used by the daemon.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This was a workaround for trying to ensure public IP addresses are
properly rebalanced after running "ctdb addip" on multiple nodes.
"ctdb reloadips" is a better solution.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is undocumented and is not needed. It was a workaround for
trying to ensure public IP addresses are properly rebalanced after
running "ctdb addip" on multiple nodes. "ctdb reloadips" is a better
solution.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
These have been scattered around the code so that
tevent_loop_allow_nesting() can be called. However, only the main
daemon and some tests currently use nested event loops.
TEVENT_DEPRECATED is already defined in the places where it is needed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Feb 26 07:11:29 CET 2016 on sn-devel-144
The "natgwlist" command is no longer marked "auto all" and is also
marked "without daemon". That latter is not strictly true because
ctdb_natgw needs the daemon so a subsequent invocation of "ctdb
nodestatus" will work. However, "without daemon" is used here because
the top-level "ctdb natgwlist" does not need to open a connection to
the daemon. It just needs to invoke ctdb_natgw.
Update tests to suit.
It would make sense to make "ctdb natgw" generally call out to
ctdb_natgw, passing all argument. However, that can be done later.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is intended to replace the use of "ctdb natgwlist" in 11.natgw
and provide different views of the NAT gateway status.
It replaces the use of CTDB_NATGW_SLAVE_ONLY=yes with a "slave-only"
keyword in the NAT gateway nodes file. This means the nodes file must
be consistent on all nodes in a NAT gateway group.
Note that this script is not yet integrated, so there are no behaviour
or documentation changes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is to avoid clash with samba structure server_id.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
This groups function prototypes for common client/server functions in
common/common.h and removes them from ctdb_private.h.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
When building standalone ctdb from git repo, samba_version_file correctly
includes git sha in VERSION string. When building standalone ctdb from
tarball, samba_version_file puts UNKNOWN in the VERSION string.
Use the packaged include/ctdb_version.h file to set the correct git sha.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Instead of includes.h, include the required header files explicitly.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
This groups function prototypes for system specific functions in
common/system.h and removes them from ctdb_private.h.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
The databases are repacked automatically during vacuuming when the
freelist size grows beyond configured threshold (RepackLimit).
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
The same structure is required in new controls for database transactions.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
In this case: ctdbd_wrapper, onnode, ctdb_diagnostics, ctdb.sudoers.
Set sensible defaults from configure options.
Update documentation to match, trying to fix up anything that has been
missed before.
The onnode unit tests need a symlink to the functions file.
The simple integration tests need to set CTDB_BASE and also
need symlinks to functions/nodes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
fixup
Signed-off-by: Martin Schwenke <martin@meltin.net>
This hasn't existed for a long time.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
Just use ctdb_tcp_connection. It is the same. There are no external
users.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Volker Lendecke <vl@samba.org>
The timed out error is ignored for certain events (start_recovery,
recoverd, takeip, releaseip). If these events time out, then the debug
hung script outputs the following:
3 scripts were executed last releaseip cycle
00.ctdb Status:OK Duration:4.381 Thu Jul 16 23:45:24 2015
01.reclock Status:OK Duration:13.422 Thu Jul 16 23:45:28 2015
10.external Status:DISABLED
10.interface Status:OK Duration:-1437083142.208 Thu Jul 16 23:45:42 2015
The endtime for timed out scripts is not set. Since the status is not
returned as -ETIME for some events, ctdb scriptstatus prints -ve duration.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
This follows the same pattern as the tstore command, and it allows
specifying key strings with a trailing \0 character.
Signed-off-by: Christof Schmitt <cs@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
Autobuild-User(master): Jeremy Allison <jra@samba.org>
Autobuild-Date(master): Mon Jul 6 23:23:22 CEST 2015 on sn-devel-104
Memory allocated by ctdb_sys_find_ifname is not
freed by the caller.
Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Michael Adam <obnox@samba.org>
A recovery is not required: when deleting a node it should already be
disconnected and when adding a node it will also be disconnected. The
new sanity checks in "reloadnodes" ensure that these assumptions are
met.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
If a recovery occurs when some nodes have reloaded and others haven't
then the nodemaps with be inconsistent so bad things will happen.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The code was too "clever". The 4 different cases should be separate.
The "node remains deleted" case doesn't need the IP address comparison
(always 0.0.0.0) or the disconnected check.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
There is no reason to serialise these or even handle remote nodes
first. Using a broadcast is more efficient and is less code.
Update expected test results to reflect changed order of messages.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Mar 23 15:04:00 CET 2015 on sn-devel-104
"ctdb reloadnodes" currently does no sanity checking of the nodes
file. This can cause chaos if a line is deleted from the nodes file
rather than commented out. It also repeatedly produces a spurious
warning for each deleted node, even if the node was deleted a long
time ago.
Instead compare the nodemap with the contents of the local nodes file
to sanity check before attempting any reloads. Note that this is
still imperfect if the nodes files are inconsistent across nodes but
it is better. Also ensure that any nodes that are to be deleted are
already disconnected. Avoid trying to talk to deleted nodes.
The current implementation is a bit unfortunate when it comes to
deleting nodes. The most obvious alternative to the above complexity
would be to reloadnodes on the specified node first, then fetch the
node map (in which newly deleted nodes would be marked as such) and
then handle the remote nodes. However, the implementation of
reloadnodes is asynchronous and it only actions the reload after 1
second. This is presumably to avoid the recovery master noticing the
inconsistency between nodemaps and triggering a recovery before all
nodes have had their nodemaps updated.
Note that this recovery can still occur if the check is done at an
inconvenient time. A better long term approach might be to quiesce
the recovery master checks while reloadnodes is in progress.
Update a unit test to reflect the change.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This compares the nodes file on the current node with that on all
nodes. If any are different then do not reload nodes.
If any nodes files can't be fetched then do not reload nodes. This
could be because some nodes are running an older version without this
feature. This is unsupported: why make a major cluster
reconfiguration while a cluster is half upgraded?
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
It should not be possible to specify "-n <othernode>", unless
<othernode> is the current node. To support this, add new function
assert_current_node_only().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
In the CTDB CLI tool source code and the documentation example.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
To support this, update printm() to replace ':' in format string with
options.machineseparator, which is a string but must contain a single
character.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
printm() is a printf(3) replacement and must be used to printing any
machine readable output. It currently just calls vprintf(3). Later
it will change the field delimiter.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Put declarations into ctdb_logging.h, factor out some common code,
clean up #includes.
Remove the check so see if the 1st character of the debug level is
'-'. This is wrong, since it is trying to check for a negative
numeric debug level (which is no longer supported) and would need to
be handled in the else anyway.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Found by address sanitizer.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri Oct 17 12:56:02 CEST 2014 on sn-devel-104
As far as we know, nobody uses this and it just complicates the
logging subsystem.
Remove all ringbuffer code and documentation. Update the local
daemons startup code correspondingly.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Volker Lendecke <vl@samba.org>
This makes it consistent with Samba, to ease transition.
Update unit test code to link to with tdb_wrap instead of including
db_wrap.c.
There are some potential whitespace fixes in this commit that have
been ignored. CTDB's lib/tdb_wrap will be deleted after the
transition to Samba's lib/tdb_wrap, so there's no point polishing it
too much.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This function is only used in this file. Samba's lib/util doesn't
have timeval_delta(), so staging a clean transition.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is part of a migration to Samba's lib/util. CTDB always passes 0
(i.e. no max_size) so use a simple assert() to enforce this, rather
than changing a lot of code that will be discarded anyway.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This effectively reverts commit 442953c540.
The correct way of telling recovery daemon to trigger a database recovery is
by setting recovery mode to active. There is no need to freeze databases as
recovery master will do that across the cluster anyway.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Recent changes have caused these commands to attempt to get
capabilities from all nodes before doing further filtering. This
means that capabilities are unnecessarily fetched from nodes that are
unlikely to be the master. If such a node does not answer the control
then many nodes can fail to calculate the master node. In the case of
natgwlist this will cause "monitor" events to fail resulting in
unhealthy nodes.
Restore the behaviour where capabilities are only fetched for a node
that will be the master if it has the desired flags.
Although this masks a problem where a connected node is not replying,
it can help to avoid an outage in some cases.
Add supporting tests and infrastructure. Infrastructure just lets a
timeout be faked - just for ctdb_ctrl_getcapabilities_stub() so far.
First test checks that this infrastructure works if the first node
times out in natgwlist. Second test checks the case worked around by
the above fix - that is, no failure when a node with PNN beyond the
NATGW master can time out.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu May 29 05:59:37 CEST 2014 on sn-devel-104
script_status->num_scripts is used as the count in this message:
"%d scripts were executed last %s cycle\n"
However, script_status->num_scripts includes disabled scripts, which
are never actually executed.
Instead, count the number of scripts that aren't disabled and make the
message print that.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed May 28 02:27:48 CEST 2014 on sn-devel-104
Now freeing ctdb_db context will close the tdb database. So make sure
all the locks are released (by freeing record handles or memory context
from which record handles are allocated) before freeing ctdb_db context.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
This makes sure that AllowClientDBAttach is set to 0 before detaching any
databases.
If someone enables the tunable between checking of tunable and actual
detaching of databases, then they deserve what they get. :-)
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
Commit ba69742ccd missed the point of
filtering disconnected nodes while limiting the nodemap to those in
the NAT gateway group. It was really to avoid trying to fetch
capabilities from disconnected nodes. This should be explicitly done
in filter_nodemap_by_capabilities(), otherwise "ctdb natgwlist" simply
fails when there is a disconnected node.
Note that the alternate solution where filter_nodemap_by_flags() is
called before filter_nodemap_by_capabilities() would not be not
correct. Filtering on flags first can produce a "healthier" set of
nodes where none of them have the NAT gateway capability.
Also extend stub for ctdb_ctrl_getcapabilities() to fail when trying
to get capabilities from a disconnected node and add a corresponding
test to confirm that "ctdb natgwlist" is no longer broken.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This way this logic is centralised. It also means that the IP address
comparisons in the NAT gateway code are IPv6 safe.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
There is a check for empty lines in the loop just below.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Factor it from read_nodes_file(). Use it there and in
read_natgw_nodes_file().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Add another filter function, like the ones for capabilities and flags
to, for filtering by NAT gateway nodes. This makes the main
natgw_list function more readable.
Note that this drops the early filtering of disconnected nodes, so
they will now be listed in a NAT gateway group. This makes sense.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The index of the nodes array in nodemap isn't the PNN.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Instead of just finding the first node that doesn't have any flags in
flag_mask set, change it into a function that filters a nodemap to
exclude nodes with the given flags.
This makes the NATGW code simpler but also provides a function that
can be used in other code.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Check capabilities once to build a filtered node list instead of
repeatedly checking capabilities
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
It looks like the original without_daemon code still tried to
establish a client connection to the daemon. Closing stderr looks to
be a cheap way of hiding the errors when this failed.
However, later cleanups avoid the client connection altogether, so do
not close stderr. Now debug output from without_daemon commands
actually appears.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri Jan 31 07:52:46 CET 2014 on sn-devel-104
If a node isn't numeric then it is silently converted to 0.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
This fixes an alignment discrepancy on 32-bit vs 64-bit platforms.
sizeof(struct ctdb_ltdb_header) = 20 (32-bit)
= 24 (64-bit)
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
This is naive and assumes no performance problems when updating
persistent DBs. It also does no error handling.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
Also add test.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
This can be useful for piping data to onnode in certain circumstances.
There are now also enough command-line options that they should
definitely be alphabetically ordered.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
When running a mixed version cluster, compatibility with older
versions was was broken during recent refactorisation.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>