If the lock file is inaccessible or the inode number changes then the
lock is lost, so exit. This allows the recovery daemon to trigger an
election. The ensuing recovery will re-take the lock.
By default the lock file is checked every 60 seconds. A lot can
happen in 60 seconds but being more aggressive and accessing the lock
too often could result in a performance issue for the cluster
filesystem.
A new optional 2nd argument is added, which is the lock file re-check
time in seconds.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
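The inode check described above can be illustrated with a minimal sketch. This is not the helper's actual code: the function names lock_inode_get() and lock_still_held() are hypothetical, and the sketch assumes the inode number is recorded once when the lock is taken and compared again on each re-check.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: record the lock file's inode number.
 * Returns false if the file cannot be accessed. */
static bool lock_inode_get(const char *path, ino_t *inode)
{
    struct stat sb;

    if (stat(path, &sb) != 0) {
        return false;
    }
    *inode = sb.st_ino;
    return true;
}

/* Hypothetical re-check: the lock counts as lost if the file is
 * inaccessible or its inode number differs from the recorded one. */
static bool lock_still_held(const char *path, ino_t expected)
{
    ino_t current;

    if (!lock_inode_get(path, &current)) {
        return false;
    }
    return current == expected;
}
```

A caller would run lock_still_held() once per re-check interval (60 seconds by default, per the message above) and exit when it returns false.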
This will allow more conditions to be waited on via additional
sub-requests. At the moment this just completes when the parent wait
completes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Put the checking for the process being immediately re-parented into
the computation too. This will be very rare and doing it
consistently makes testing saner.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
There is no need to wait until the parent kills the helper. The
parent will get the initial response, indicating contention or
similar, and will then get a separate event indicating that the pipe
is gone.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This makes the code more explicit and makes testing easier due to fewer
dependencies.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
clang warns:
ctdb/server/ctdb_mutex_fcntl_helper.c:61:3: warning: Value stored to 'fd' is never read
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
One of these had a missing space, so this implicitly fixes it. It
also drops the need to unnecessarily include common.h, which comes
with some dependency baggage.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Only do this if the recovery lock is unset. Log every minute for the
first 10 minutes, then every 10 minutes, then every hour.
This is useful for determining whether a split brain occurred. It is
particularly useful if logging failed or was throttled at startup, so
there is no evidence of the split brain when it began.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
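The logging schedule above can be sketched as a function from elapsed time to the next logging interval. The name log_interval() is hypothetical, and the point at which logging drops from every 10 minutes to hourly is not stated in the message; the sketch assumes it happens after the first hour.

```c
/* Hypothetical sketch: seconds to wait before the next log message,
 * given the number of seconds since logging began. */
static unsigned int log_interval(unsigned int elapsed_secs)
{
    if (elapsed_secs < 10 * 60) {
        return 60;           /* every minute for the first 10 minutes */
    }
    if (elapsed_secs < 60 * 60) {
        return 10 * 60;      /* then every 10 minutes (assumed: up to 1 hour) */
    }
    return 60 * 60;          /* then every hour */
}
```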
In tini, allow_empty_value=false causes the parser to ignore the lines
without '=' sign, but lines with nothing after '=' sign are allowed and
cause empty string ("") to be passed as a value.
This is counter-intuitive, so conf requires special handling for empty
values (which are treated as invalid).
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Regression introduced by commit
2558f96da1. count should be signed
because list_of_connected_nodes() returns -1 on failure. Variable i
is used in both signed and unsigned contexts, so add new signed
variable j for use in signed context.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14017
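A minimal sketch of why the count has to be signed. connected_count() is a hypothetical stand-in for a function that, like list_of_connected_nodes(), returns a count or -1 on failure; it is not the real function or its signature.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in: returns a count, or -1 on failure. */
static int connected_count(bool fail)
{
    return fail ? -1 : 3;
}

/* With a signed variable the failure value survives and can be tested. */
static bool lookup_failed(bool fail)
{
    int count = connected_count(fail);

    return count == -1;
}

/* Stored in an unsigned variable, -1 wraps to UINT_MAX, so a loop
 * bounded by this value would run far past the end of the list. */
static unsigned int count_as_unsigned(bool fail)
{
    return (unsigned int)connected_count(fail);
}
```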
CTDB should start as a disabled unit (systemd) in most
distributions and, when trying to enable it for the first time, the user
should get an unconfigured, or similar, error.
Depending on the /etc/ctdb/nodes file will give the end user a clear
indication of what is needed to get the cluster up and running. It should
work like the previous ENABLED=NO variables in SysV-like initialization
scripts.
Signed-off-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This makes it consistent with print-socket.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Jul 5 06:19:11 UTC 2019 on sn-devel-184
csbuild doesn't like the hack where variable buf is initialised to
itself to avoid an unused variable warning. buf is unused so remove
it instead.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This can never be NULL. It could probably be NULL in the past when
"all database" locks existed.
There are paths where it is checked for NULL and then later
dereferenced, causing static analysers to produce spurious warnings.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This avoids static analysers continuing analysis after calls to these
functions and producing incorrect warnings.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Otherwise ret == 0 is returned from successful call to
ctdb_int32_pull().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
According to the documentation, sendto() should either send the packet
as given or return with an error. However, given that it can return
the number of bytes sent, treat the theoretical error of a short
packet send separately, since errno would not be set in this case.
Similarly, treat a short packet recv() separately from an error where
errno is set.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
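The distinction between a failed send and a short send can be sketched as follows. check_send_result() is a hypothetical helper, and the choice of EMSGSIZE for the short-send case is an assumption made for illustration; the message above does not say which error code the commit uses.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper: classify the return value of a send call.
 * ret is the value returned by sendto(), len is the packet length and
 * saved_errno is errno captured immediately after the call. */
static int check_send_result(ssize_t ret, size_t len, int saved_errno)
{
    if (ret == -1) {
        return saved_errno;   /* real error: errno is meaningful */
    }
    if ((size_t)ret != len) {
        return EMSGSIZE;      /* short send: errno was not set (assumed code) */
    }
    return 0;
}
```

The cast to size_t is safe here because ret has already been checked against -1, which also avoids a signed/unsigned comparison.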
This avoids an unnecessary signed/unsigned comparison issue.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
clang reports:
ctdb/protocol/protocol_types.c:5191:3: warning: Value stored to 'ret' is never read
Found by csbuild.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Indexing by PNN is wrong.
This also removes a signed/unsigned comparison because the PNN is not
compared to -1 anymore.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Node ID is a poorly defined concept, indicating the slot in the node
map where the IP address was found. This signed value also ends up
compared to num_nodes, which is unsigned, producing unwanted warnings.
Just return the PNN because this is what both callers really want.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Static analysis finds that earlier in the call path, ctdb_string_len()
checks for NULL, so complains that a NULL value can be passed to
strlen() here. Avoid this by adding an assert().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The dummy reader should never be called, so contains an assert on the
buffer length that should always trigger. Just abort() instead.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
These are all cases comparing a number of bytes written (int or
ssize_t) with a size_t, so casting to size_t is appropriate.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Change declarations of variables and parameters, usually loop variables
and limits, from int to unsigned int, size_t or uint32_t.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This improves readability. Also, the asserts involving this
expression get more complicated in the next commit, so this will keep
those asserts within a single line.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
... and does not just contain whitespace.
Otherwise NULL can be passed as the first argument to execv().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Instead of taking exclude_pnn as a parameter, calculate it from an
include_self_parameter, which is passed through from the 2 calling
functions.
While doing this, fix a signed/unsigned comparison issue by declaring
the new exclude_pnn local variable as an unsigned type.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The next commit will change the type of this function, which is only
used in this file. So, make it static to isolate the change.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This has been broken for 10 years since commit
9616959bd6, which introduced the
separate filtering. This commit was missing a redirect of the output
of stderr_filter() to stderr.
Since nobody depends on the separate filtering (i.e. nobody reported a
bug), just return to combining stdout and stderr, and filtering them
together.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This filter no longer does anything useful in this context. By
default it adds a pipeline with trailing cat process. In many
contexts, stdout of the process being run is still open so the cat
process will stay around and will stop onnode from exiting.
The filters should all go away because they are simply an example of
code that is trying to be too clever while causing unfortunate corner
cases.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
I don't think anyone uses this and it causes complications.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Yes, the other callers check the return value of ctdb_lockdb_mark().
However, this is called in a void function and ctdb_lockdb_mark() has
already printed any error message. All we can do is explicitly ignore
the return value.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
To make this much clearer, move the declaration into the scope where
it is used.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This fixes warnings about signed versus unsigned comparisons.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Simple cases where variables and function parameters need to be
declared as an unsigned type instead of an int.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Simple cases where variables need to be declared as an unsigned type
instead of an int.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Simple cases where variables and function parameters need to be
declared as an unsigned type instead of an int.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The incremented value of argc is indeed never used. Leave it as a
comment to warn anyone cutting and pasting the code.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
In one case, given triviality of change, add missing braces and fix
whitespace.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Simple cases where a variable (usually a loop variable) needs to be
declared as an unsigned type (unsigned int or size_t) instead of an
int.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
All the top-level callers pass size_t.
Drop the ternary operator. The value of hsize is always positive
because it is unsigned.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
There's no point using unsigned here. tdb_traverse() returns an int
for the number of records traversed and the number of empty records
can't exceed this value.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This needs an extra variable because variable i has been used in both
signed and unsigned contexts.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
These are the simple cases where a variable (usually a loop variable)
needs to be declared as an unsigned type (usually unsigned int or
size_t) instead of an int.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
net.ipv4.tcp_tw_recycle was removed in Linux 4.12 but it still
makes sense to check for its existence. Unfortunately, the current check
does not test whether the procfs file exists. This commit fixes the issue.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13984
Signed-off-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Andreas Schneider <asn@cryptomilk.org>
Autobuild-Date(master): Tue Jun 4 23:31:24 UTC 2019 on sn-devel-184
The caller should be able to call TALLOC_FREE() on the returned
strings.
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Fixes
ctdb/server/ipalloc_lcp2.c:61: error: shiftTooManyBitsSigned: Shifting signed 32-bit value by 31 bits is undefined behaviour <--[cppcheck]
Signed-off-by: Noel Power <noel.power@suse.com>
Reviewed-by: Andreas Schneider <asn@samba.org>
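The cppcheck error above comes from shifting a signed 32-bit 1 into the sign bit. A minimal sketch of the usual remedy, shifting an unsigned value instead, follows; bit_mask() is a hypothetical name and is not taken from ipalloc_lcp2.c.

```c
#include <stdint.h>

/* Hypothetical helper: return a mask with only bit n set.
 * Shifting the unsigned value (uint32_t)1 by up to 31 bits is well
 * defined, whereas 1 << 31 on a signed 32-bit int is not. */
static uint32_t bit_mask(unsigned int n)
{
    if (n >= 32) {
        return 0;   /* shifting by the full width or more is also undefined */
    }
    return (uint32_t)1 << n;
}
```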
Old war story completely from memory, I could not find the commit that
introduced TDB_SEQNUM so far...:
Back in the days when ctdb was initially developed, TDB_SEQNUM's only
user was the notify.tdb that held one huge record for all notify
records. With that use case in mind it made perfect sense to keep the
SEQNUM stable locally, sacrificing precision. By now notify.tdb is
long gone, and the only user of TDB_SEQNUM right now is brlock.tdb,
which contains special case code for the imprecise ctdb implementation
of TDB_SEQNUM.
With this commit, that special code can go: The TDB_SEQNUM will also
increment when just the DMASTER header field changes, indicating to
smbd that someone else might have changed the record. This will of
course increase the SEQNUM frequency, but it should not increase the
load on ctdb: If you look at the brlock.c workaround, it just does not
do the caching that is possible with precise TDB_SEQNUMs working.
How did I get here? I want to move brl_num_read_oplocks() from
brlock.tdb into locking.tdb, and for that I need precise TDB_SEQNUMs
for locking.tdb.
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
Autobuild-User(master): Jeremy Allison <jra@samba.org>
Autobuild-Date(master): Fri May 24 00:42:17 UTC 2019 on sn-devel-184
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
Autobuild-User(master): Jeremy Allison <jra@samba.org>
Autobuild-Date(master): Thu May 23 18:08:36 UTC 2019 on sn-devel-184
state is always freed before exiting this function, so allocate fde
off it instead of long-lived ctdb context.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13943
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
There is a chance that restoring IP addresses to the test node will
result in different IP addresses being assigned to that node.
Removing a single IP address may then fail (or be a no-op) if it is
done after the restore.
So, swap the single IP address removal to happen first, then restore,
then remove all IP addresses.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
ctdb reloadips will fail if it can't disable takeover runs. The most
likely reason for this is that there is already a takeover run in
progress. We can't predict when this will happen, so retry if this
occurs.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Otherwise, when looping tests for a long time, nodes are unable to
connect to each other.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon May 13 08:42:44 UTC 2019 on sn-devel-184
ctdb_control_db_attach() and ctdb_control_db_detach() assume that any
control with client ID 0 comes from another daemon and treat it
specially.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13930
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2b5dbb3525 fixed builds with an explicit
--with-libcephfs but broke builds against system Ceph libraries. This
change handles both cases.
Signed-off-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu May 9 04:24:56 UTC 2019 on sn-devel-184
sock_socket_start_recv() might not fill sockpath if we return early.
Found by GCC 9.
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
error: ‘%04d’ directive writing between 4 and 11 bytes into a region of
size 5 [-Werror=format-overflow=]
sprintf(key, "key%04d", i);
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Ralph Boehme <slow@samba.org>
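The warning above arises because "%04d" can expand to as many as 11 characters for an int, so "key%04d" may need up to 15 bytes including the terminating NUL. One common way to address this, sketched below, is a larger buffer combined with snprintf() and a truncation check. make_key() is a hypothetical helper; the message does not show how the commit itself resolves the warning.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format "key%04d" into buf without overflowing.
 * Returns 0 on success, -1 if the result did not fit. */
static int make_key(char *buf, size_t buflen, int i)
{
    int n = snprintf(buf, buflen, "key%04d", i);

    if (n < 0 || (size_t)n >= buflen) {
        return -1;
    }
    return 0;
}
```

A 16-byte buffer is enough for every int value with this format.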
If the directory is always cleaned up then it is not possible to look
at daemon logs to debug test failures.
This target is only really used by autobuild.py, which (optionally)
cleans up the parent directory anyway.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue May 7 06:56:01 UTC 2019 on sn-devel-184
Since commit 0e9ead8f28 daemons have
been shut down after each test, so this option no longer has anything
to do with killing daemons.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Sometimes the detach test fails:
Check detaching single test database detach_test1.tdb
BAD: database detach_test1.tdb is still attached
Number of databases:4
dbid:0x5ae995ee name:detach_test4.tdb path:tests/var/simple/node.0/db/volatile/detach_test4.tdb.0
dbid:0xd84cc13c name:detach_test3.tdb path:tests/var/simple/node.0/db/volatile/detach_test3.tdb.0
dbid:0x8e8e8cef name:detach_test2.tdb path:tests/var/simple/node.0/db/volatile/detach_test2.tdb.0
dbid:0xc62491f4 name:detach_test1.tdb path:tests/var/simple/node.0/db/volatile/detach_test1.tdb.0
Number of databases:3
dbid:0x5ae995ee name:detach_test4.tdb path:tests/var/simple/node.1/db/volatile/detach_test4.tdb.1
dbid:0xd84cc13c name:detach_test3.tdb path:tests/var/simple/node.1/db/volatile/detach_test3.tdb.1
dbid:0x8e8e8cef name:detach_test2.tdb path:tests/var/simple/node.1/db/volatile/detach_test2.tdb.1
Number of databases:4
dbid:0x5ae995ee name:detach_test4.tdb path:tests/var/simple/node.2/db/volatile/detach_test4.tdb.2
dbid:0xd84cc13c name:detach_test3.tdb path:tests/var/simple/node.2/db/volatile/detach_test3.tdb.2
dbid:0x8e8e8cef name:detach_test2.tdb path:tests/var/simple/node.2/db/volatile/detach_test2.tdb.2
dbid:0xc62491f4 name:detach_test1.tdb path:tests/var/simple/node.2/db/volatile/detach_test1.tdb.2
*** TEST COMPLETED (RC=1) AT 2019-04-27 03:35:40, CLEANING UP...
When issued from a client, the detach control re-broadcasts itself
asynchronously to all nodes and then returns success. The controls to
some nodes to do the actual detach may still be in flight when success
is returned to the client. Therefore, the test should wait for a few
seconds to allow the asynchronous controls to complete.
The same is true for the attach control, so work around the problem in
the attach test too.
An alternative is to make the attach and detach controls synchronous
by avoiding the broadcast and waiting for the results of the
individual controls sent to the nodes. However, a simple
implementation would involve adding new nested event loops.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This sometimes fails, apparently due to a cat process in onnode
getting EAGAIN. The conclusion is that tests that process large
amounts of output should not depend on a sub-shell delivering that
output into a shell variable.
Change try_command_on_node() to leave all of the output in file
$outfile and just put the first 1KB into $out. $outfile is removed
after each test completes.
Change the implementation of sanity_check_output() to use $outfile
instead of $out.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
All callers currently pass $out. Global variable $out is used
in many other places so use it here to simplify the interface and make
future changes simpler.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
CTDB's system memory monitoring in 05.system.script monitors both main
memory and swap. The swap monitoring was originally based on
the (possibly incorrect, see below) idea that swap space stacks on top
of main memory, so that when a system starts filling swap space then
this is supposed to be a good sign that the system is running out of
memory. Additionally, performance on a Linux system tends to be
destroyed by the I/O associated with a lot of swapping to spinning
disks.
However, some platforms default to creating only 4GB of swap space
even when there is 128GB of main memory. With such a small swap to
main memory ratio, memory pressure can force swap to be nearly full
even when a significant amount of main memory is still available and
the system is performing well. This suggests that checking swap
utilisation might be less than useful in many circumstances.
So, remove the separate swap space checking and change the memory
check to cover the total of main memory and swap space.
Test function set_mem_usage() still takes an argument for each of main
memory and swap space utilisation. For simplicity, the same number is
now passed twice to make the intended results comprehensible. This
could be changed later.
A couple of tests are cleaned up to no longer use hard-coded
/proc/meminfo and ps output.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
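The arithmetic behind the combined check can be sketched as below. The actual check lives in a shell script (05.system.script); this C function, its name mem_usage_percent(), and the choice of inputs (total and available main memory, total and free swap, all in kB) are assumptions for illustration only.

```c
/* Hypothetical sketch: utilisation, in percent, of main memory and
 * swap taken together. All arguments are in kB. */
static unsigned int mem_usage_percent(unsigned long long mem_total,
                                      unsigned long long mem_avail,
                                      unsigned long long swap_total,
                                      unsigned long long swap_free)
{
    unsigned long long total = mem_total + swap_total;
    unsigned long long avail = mem_avail + swap_free;

    if (total == 0 || avail > total) {
        return 0;   /* nothing sensible to report */
    }
    return (unsigned int)(100 * (total - avail) / total);
}
```

With 128GB of main memory that is half available and 4GB of swap that is completely full, the combined utilisation is about 51%, whereas a swap-only check would report 100%.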
This is to help us notice when ctdbd is using the full capacity of a
CPU, so is saturated.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13895
In run_proc, there was an implicit assumption that when a process exits,
fd event (pipe between parent and child) would be processed first and
signal event (SIGCHLD for the child) would be processed later.
However, that is not the case. SIGCHLD can be received asynchronously
any time even when the pipe data has not fully been read. This causes
run_proc to miss some of the output from child process in tests.
When SIGCHLD is being processed, if the pipe between parent and child is
still open, then do an explicit read from the pipe to ensure we read any
data still in the pipe before closing the pipe.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Apr 12 08:19:29 UTC 2019 on sn-devel-144
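The explicit read described above can be sketched as a drain loop that is run before the pipe is closed. drain_fd() is a hypothetical function, not the run_proc code. It assumes that the write end of the pipe has been closed (the child has exited) or that the descriptor is non-blocking; otherwise read() could block.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical sketch: read whatever is still buffered in fd into buf.
 * Returns the number of bytes read, or -1 on a real error. */
static ssize_t drain_fd(int fd, char *buf, size_t buflen)
{
    size_t total = 0;

    while (total < buflen) {
        ssize_t n = read(fd, buf + total, buflen - total);

        if (n == -1) {
            if (errno == EINTR) {
                continue;   /* interrupted, e.g. by another signal */
            }
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                break;      /* non-blocking fd with nothing left */
            }
            return -1;
        }
        if (n == 0) {
            break;          /* EOF: the writer has closed its end */
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```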
We also can not assume that nodes can be marked as connected via only
the keepalive mechanism. Keepalives are not sent to disconnected
nodes so, in the absence of other packets (e.g. broadcasts), 2 nodes
may never become marked as connected to each other.
Revert to marking nodes as connected in the TCP transport code. If a
connection is to a non(-operational) ctdbd then it will revert to
disconnected after a short while and may actually flap. This should
be rare.
This reverts commit 66919db3d7.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13888
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
I thought this was being triggered during automated testing.
However, it appears that a poor choice of fixed ports for NFS RPC
services was the real problem. Revert, since the original behaviour
may be useful.
This reverts commit f1a1c300e1.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The new string conversion wrappers detect and flag errors
which occurred during the string to integer conversion.
Those modifications required an update of the callees
error checks.
Signed-off-by: Swen Schillig <swen@linux.ibm.com>
Reviewed-by: Ralph Boehme <slow@samba.org>
Reviewed-by: Christof Schmitt <cs@samba.org>
The new string conversion wrappers detect and flag errors
which occurred during the string to integer conversion.
Those modifications required an update of the callees
error checks.
Signed-off-by: Swen Schillig <swen@linux.ibm.com>
Reviewed-by: Ralph Boehme <slow@samba.org>
Reviewed-by: Christof Schmitt <cs@samba.org>
If a data packet arrives which exceeds the queue's current buffer size,
the buffer needs to be increased to hold the full packet. Once the packet
is processed the buffer size should be decreased to its standard size again.
This test case verifies this process.
Signed-off-by: Swen Schillig <swen@linux.ibm.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Christof Schmitt <cs@samba.org>
Autobuild-User(master): Christof Schmitt <cs@samba.org>
Autobuild-Date(master): Wed Apr 10 00:17:37 UTC 2019 on sn-devel-144
Some test scenarios require access to the created queue.
Prepare the test_setup function to provide it as additional parameter.
Signed-off-by: Swen Schillig <swen@linux.ibm.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Christof Schmitt <cs@samba.org>
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Sat Apr 6 11:51:55 UTC 2019 on sn-devel-144
While 0 may indicate that all threads have exited after being stuck,
it may also indicate that nfsd should not be running due to being shut
down.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Sun Mar 31 11:47:44 UTC 2019 on sn-devel-144
The alternative seems to be to try something via CTDB_NFS_CALLOUT.
That would be complicated and seems like overkill for something this
simple.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
The situation for NFS config has got more complicated and is probably
broken in statd-callout on Debian-like systems at the moment. Allow
several alternative configuration names to be tried. Stop after the
first that is found and loaded.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
The tests are written around the default of sysvinit-redhat. Add
support for systemd-redhat.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
At least Red Hat and Debian appear to use (a variant of?) the upstream
systemd units for NFS, so adding support for these services is
relatively easy. Distributions using Sys-V init can patch the
call-out to use the relevant Sys-V init services.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
When an NFS check restarts a failed service by hand then systemd will
be unable to stop or start this service again because (at least) the
PID file will be wrong. Do this via the NFS Linux kernel call-out
instead. Allow the call-out to use the services instead of doing
manual restarts. Add variables for mount, status and rquotad services
to support this.
Adding systemd NFS services to the call-out will follow.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
There will be more of these variables for other services so, for
readability, it makes sense for them to start with "nfs_".
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
We do this by removing the confusing mandatory option to
conf.SAMBA_CHECK_PYTHON{,_HEADERS}() and instead using the value of
--disable-python internally.
This follows the default minimum of Python 3.4 and keeps things consistent
with the main Samba build where --disable-python is required to skip building
python bindings.
Signed-off-by: Andrew Bartlett <abartlet@samba.org>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
We now use the default of 3.4 from conf.SAMBA_CHECK_PYTHON().
Signed-off-by: Andrew Bartlett <abartlet@samba.org>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
There is no need to write SAMBA_VERSION_STRING as CTDB_VERSION_STRING.
Wherever required use SAMBA_VERSION_STRING directly.
Avoids the confusion with two version.h files.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13789
Signed-off-by: Amitay Isaacs <amitay@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Mar 15 06:31:50 UTC 2019 on sn-devel-144
This way we don't get constant rebuild as SAMBA_VERSION_STRING
is "4.7.0pre1.DEVELOPERBUILD" for the binaries under bin/
instead of "4.7.0pre1.GIT.59e51f6".
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13789
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
This can be used to test the version checking logic. Cache the
version to avoid re-checking the environment variable each time.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
This file is included by local_daemons.sh, which is not a bash script
and wait_until() uses the "local" keyword. Prefixing variable names
with '_' to indicate that they are local changes a lot of lines in
this function. So, fix indentation, potential quoting problems and
other ShellCheck hits while touching this function.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
This isn't used anywhere that requires it to be exported, but the lack
of consistency will cause problems and confusion at some later stage.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
ctdb_sock_addr values are hashed in some contexts. This means that
all of the memory used for the ctdb_sock_addr should be consistent
regardless of how parsing is done. The first 2 cases are just sanity
checks but the 3rd case involving an IPv4-mapped IPv6 address is the
real target of this test addition.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13839
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
Killing a TCP connection that uses IPv4-mapped IPv6 addresses fails
(e.g. ctdb_killtcp eth0 ::ffff:192.168.200.44:2049
::ffff:192.168.200.45:863).
When ctdb_killtcp is used to kill a TCP connection, the IPs and
ports of the connection are parsed into conn.client and conn.server
(call stack: main->ctdb_sock_addr_from_string->ip_from_string). In
ip_from_string, because IPv4-mapped IPv6 addresses are used,
ipv6_from_string first parses the IP into addr.ip6. Next,
ipv4_from_string reparses the IP into addr.ip.
As a result, the data dumped from conn.server is "2 0 8 1 192 168
200 44 0 0 0 0 0 0 0 0 0 0 255 255 192 168 200 44 0 0 0 0" and the data
from conn.client is "2 0 3 95 192 168 200 45 0 0 0 0 0 0 0 0 0 0 255 255
192 168 200 45 0 0 0 0". The connection is added to conn_list by
ctdb_connection_list_add. Then reset_connections_send takes conn_list
as a parameter and starts resetting the connections in conn_list.
In reset_connections_send, the database "connections" is
created. The connections from conn_list are written to the
database (via db_hash_add), using the data dumped from conn.client
and conn.server as the key.
In reset_connections_capture_tcp_handler,
ctdb_sys_read_tcp_packet receives data on the raw socket and
extracts the IPs and ports from the TCP packet. When extracting the IP
and port, tcp4_extract or tcp6_extract is used. This yields a
new conn.client and conn.server. The data dumped from
conn.server is "2 0 8 1 192 168 200 44 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0" and the data from conn.client is "2 0 3 95 192 168 200 45 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0". Finally, this data is used as the key to
check whether the connection is one being reset (via db_hash_delete).
db_hash_delete returns ENOENT because the keys used
by db_hash_delete and db_hash_add are different.
So, the TCP RST is never sent for the connection. Initialize the
addr struct to zero before reparsing as IPv4 in
ip_from_string.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13839
Signed-off-by: Zhu Shangzhong <zhu.shangzhong@zte.com.cn>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
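The requirement that hashed addresses be byte-for-byte identical, however they were parsed, can be illustrated with a minimal sketch. The union sock_addr and parse_ip() below are hypothetical stand-ins, not ctdb_sock_addr or ip_from_string, and the parsing order differs from the call sequence described above. The sketch only demonstrates the idea of zeroing the whole structure so that no IPv6 residue is left behind an IPv4 result.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical stand-in for a socket address that is used as a hash key. */
union sock_addr {
    struct sockaddr sa;
    struct sockaddr_in ip;
    struct sockaddr_in6 ip6;
};

/* Parse an IPv4, IPv6 or IPv4-mapped IPv6 string. IPv4-mapped input is
 * stored as plain IPv4 with all other bytes zero. */
static bool parse_ip(const char *str, union sock_addr *addr)
{
    static const unsigned char v4mapped_prefix[12] = {
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff
    };
    struct in_addr v4;
    struct in6_addr v6;

    /* Zero everything first so unused bytes match on every path. */
    memset(addr, 0, sizeof(*addr));

    if (inet_pton(AF_INET, str, &v4) == 1) {
        addr->ip.sin_family = AF_INET;
        addr->ip.sin_addr = v4;
        return true;
    }
    if (inet_pton(AF_INET6, str, &v6) != 1) {
        return false;
    }
    if (memcmp(v6.s6_addr, v4mapped_prefix, sizeof(v4mapped_prefix)) == 0) {
        /* IPv4-mapped: keep only the trailing 4 bytes. */
        memcpy(&v4, &v6.s6_addr[12], sizeof(v4));
        addr->ip.sin_family = AF_INET;
        addr->ip.sin_addr = v4;
        return true;
    }
    addr->ip6.sin6_family = AF_INET6;
    addr->ip6.sin6_addr = v6;
    return true;
}
```

Comparing the results of parsing "192.168.200.44" and "::ffff:192.168.200.44" with memcmp() over the whole union shows whether the two would produce the same hash key.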