1
0
mirror of https://github.com/samba-team/samba.git synced 2025-01-10 01:18:15 +03:00
Commit Graph

8553 Commits

Author SHA1 Message Date
Martin Schwenke
5a702b01f6 ctdb-tests: Fix return value of DB test tool delete command
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-12 03:11:39 +00:00
Martin Schwenke
a40fc709cc ctdb-tcp: Make error handling for outbound connection consistent
If we can't bind the local end of an outgoing connection then
something has gone wrong.  Retrying is better than failing into a
zombie state.  The interface might come back up and/or the address my
be reconfigured.

While here, do the same thing for the other (potentially transient)
failures.

The unknown address family failure is special but just handle it via a
retry.  Technically it can't happen because the node address parsing
can only return values with address family AF_INET or AF_INET6.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14274

Reported-by: 耿纪超 <gengjichao@jd.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-12 03:11:39 +00:00
Martin Schwenke
0b3db29bd5 ctdb-tests: Add some tool unit tests to ensure that timeouts work
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Feb 10 05:34:08 UTC 2020 on sn-devel-184
2020-02-10 05:34:08 +00:00
Martin Schwenke
0e59cd25e1 ctdb-tools: Allow shorter runtime limit to be specified
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:39 +00:00
Martin Schwenke
39206fd327 ctdb-tools: When in test mode set process group in top-level ctdb tool
If ctdbd hangs when shutting down in post-test clean-up then killing
the process group can kill the test.  When in test mode, create a
process group but only in the top-level ctdb tool - the natgw and lvs
helpers also run the ctdb tool.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:39 +00:00
Martin Schwenke
3b0b830e40 ctdb-tests: Use $PWD/bin/ if it exists when running in-tree
When running tests from a top-level build, a stale build in ctdb/bin/
will be preferred and may cause confusing results.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:39 +00:00
Martin Schwenke
b0b14e4edd ctdb-tests: Make $ctdb_dir absolute
This is used to set several variables so it might as well be cd-proof.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:39 +00:00
Martin Schwenke
1a0e1f8924 ctdb-daemon: Fork when not interactive and test mode is enabled
There is no sane way of keeping stdin open when using the shell to
background ctdbd in local_daemons.sh.  Instead, have ctdbd fork when
not interactive and when test mode is enabled.  become_daemon() can't
be used for this: if it forks then it also closes stdin.

For the interactive case, become_daemon() wasn't doing anything
special, so do nothing instead.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:39 +00:00
Martin Schwenke
a220e9454a ctdb-daemon: Make some conditions more explicit
These don't need to depend on do_fork.  Child logging should be set up
whenever the daemon is not interactive.  The stdin handler should be
setup whenever test mode is enabled.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:39 +00:00
Martin Schwenke
cefb3327c6 ctdb-daemon: Pass more information to ctdb_start_daemon()
No functional changes.

This is staging for a change that makes ctdbd fork when test mode is
enabled but interactive is not set.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:38 +00:00
Martin Schwenke
3509aa28d4 ctdb-tests: Don't actually close stdin in fake ssh
A subsequent file descriptor allocation may return 0 and unexpected
things may then happen.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:38 +00:00
Martin Schwenke
8de1bb75e5 ctdb-tests: Redirect stdin from /dev/null when running a test
Otherwise, if the test is run via ssh it will "unexpectedly" find
itself at the other end of a pipe.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:38 +00:00
Martin Schwenke
0737849b90 Revert "ctdb-tests: Enable job control when keeping stdin open"
This doesn't work when stdin is not a tty.

This reverts commit ea754bfdec.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:38 +00:00
Volker Lendecke
3d40efaed8 ctdb-test: Fix a typo
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Andreas Schneider <asn@samba.org>

Autobuild-User(master): Andreas Schneider <asn@cryptomilk.org>
Autobuild-Date(master): Thu Jan 30 13:53:22 UTC 2020 on sn-devel-184
2020-01-30 13:53:22 +00:00
Martin Schwenke
ea754bfdec ctdb-tests: Enable job control when keeping stdin open
POSIX says:

  If job control is disabled (see set, -m), the standard input for an
  asynchronous list, before any explicit redirections are performed,
  shall be considered to be assigned to a file that has the same
  properties as /dev/null. This shall not happen if job control is
  enabled. In all cases, explicit redirection of standard input shall
  override this activity.

ctdbd is backgrounded at startup, so the above causes stdin to be
redirected from /dev/null.  Enable job control to work around this.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Tue Jan 28 11:24:35 UTC 2020 on sn-devel-184
2020-01-28 11:24:35 +00:00
Martin Schwenke
2380b13bf8 ctdb-tests: Don't close stdin when starting local daemons
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-01-28 09:57:33 +00:00
Martin Schwenke
cf460bd9c4 ctdb-daemon: Shut down if interactive and stdin is closed
This allows a test environment to simply close its end of a pipe to
cleanly shutdown ctdbd.  Like in smbd, this is only done if stdin is a
pipe or a socket.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-01-28 09:57:32 +00:00
Martin Schwenke
d79e2dcfc8 ctdb-daemon: Only stop monitoring if it has been initialised
This avoids a crash if ctdb_shutdown_sequence() is called before
monitoring is initialised.

Switch to using TALLOC_FREE() while touching this function.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-01-28 09:57:32 +00:00
Martin Schwenke
aa2977e151 ctdb-mutex: Change default re-check time for fcntl helper to 5s
Testing against a commonly used cluster filesystem has shown no
performance impact, as expected.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-01-21 11:39:40 +00:00
Martin Schwenke
14b1dffc27 ctdb-tests: Add some tests to check recovery from recovery lock issues
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-01-21 11:39:40 +00:00
Martin Schwenke
64501f5193 ctdb-tests: Put recovery lock for local daemons into a subdirectory
This makes it more like the way it works with a cluster filesystem.
It also allows the subdirectory to be manipulated in tests.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-01-21 11:39:40 +00:00
Martin Schwenke
93fc31858f ctdb-tests: Add local_daemons.sh option for recovery lock recheck interval
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-01-21 11:39:40 +00:00
Volker Lendecke
42a3e2e503 ctdbd: Use struct initialization
2 lines less

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
2020-01-19 18:29:39 +00:00
Martin Schwenke
9edf15afc2 ctdb-tests: Skip some tests that don't work with IPv6
See the comments added to the tests.

It may be possible to rewrite these so they do something sane for
IPv6... some other time.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14227

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Jan  3 00:00:55 UTC 2020 on sn-devel-184
2020-01-03 00:00:55 +00:00
Martin Schwenke
693080abe4 ctdb-scripts: Strip square brackets when gathering connection info
ss added square brackets around IPv6 addresses in versions > 4.12.0
via commit aba9c23a6e1cb134840c998df14888dca469a485.  CentOS 7 added
this feature somewhere mid-release.  So, backward compatibility is
obviously needed.

As per the comment protocol/protocol_util.c should probably print and
parse such square brackets.  However, for backward compatibility the
brackets would have to be stripped in both places in
update_tickles()...  or added to the ss output when missing.  Best to
leave this until we have a connection tracking daemon.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14227

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-01-02 22:36:34 +00:00
Amitay Isaacs
963a639101 ctdb-tests: Add tests for cmdline_add() api
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Nov 14 12:03:46 UTC 2019 on sn-devel-184
2019-11-14 12:03:46 +00:00
Amitay Isaacs
e469d6c119 ctdb-common: Add api to add new section/commands to cmdline
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2019-11-14 10:38:34 +00:00
Amitay Isaacs
977a6f7fad ctdb-common: Change cmdline implementation to support multiple sections
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2019-11-14 10:38:34 +00:00
Amitay Isaacs
7a008c6b74 ctdb-tests: Update cmdline tests for section name
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2019-11-14 10:38:34 +00:00
Amitay Isaacs
b2b24c91fa ctdb-common: Add section to group commands in cmdline
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2019-11-14 10:38:34 +00:00
Amitay Isaacs
29948d7b1e ctdb-common: Generate usage message from cmdline_parse()
If any of the option parsing or command parsing fails, generate usage
message.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2019-11-14 10:38:34 +00:00
Martin Schwenke
e45feaf28d ctdb-tcp: Simplify freeing of transport data on shutdown
The type-checking is superfluous and gets in the way of readability.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu Nov 14 03:45:44 UTC 2019 on sn-devel-184
2019-11-14 03:45:44 +00:00
Martin Schwenke
750f3938e4 ctdb-daemon: Rename ctdb_context private_data to transport_data
This gives a casual reader a useful clue.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-11-14 02:20:46 +00:00
Martin Schwenke
53f8492caa ctdb-daemon: Rename ctdb_node private_data to transport_data
This gives a casual reader a useful clue.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-11-14 02:20:46 +00:00
Volker Lendecke
a6d99d9e5c ctdb-tcp: Close inflight connecting TCP sockets after fork
Commit c68b6f96f2 changed the talloc hierarchy such that outgoing TCP sockets
while sitting in the async connect() syscall are not freed via
ctdb_tcp_shutdown() anymore, they are hanging off a longer-running structure.
Free this structure as well.

If an outgoing TCP socket leaks into a long-running child process (possibly the
recovery daemon), this connection will never be closed as seen by the
destination node. Because with recent changes incoming connections will not be
accepted as long as any incoming connection is alive, with that socket leak
into the recovery daemon we will never again be able to successfully connect to
the node that is affected by this leak. Further attempts to connect will be
discarded by the destination as long as the recovery daemon keeps this socket
alive.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14175
RN: Avoid communication breakdown on node reconnect

Signed-off-by: Martin Schwenke <martin@meltin.net>
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-11-14 02:20:46 +00:00
Amitay Isaacs
816205027a ctdb-ib: Fix build errors for infiniband transport
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Wed Nov 13 13:31:10 UTC 2019 on sn-devel-184
2019-11-13 13:31:10 +00:00
Andrew Bartlett
92ce387ed0 build: Remove workaround for missing os.path.relpath in Python < 2.6
Signed-off-by: Andrew Bartlett <abartlet@samba.org>
Reviewed-by: David Mulder <dmulder@suse.com>
Reviewed-by: Andreas Schneider <asn@samba.org>
2019-11-13 08:42:30 +00:00
Volker Lendecke
f5f89b1b99 ctdb: Use TALLOC_FREE() in a few places
We have a macro for NULLing out the pointer

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Nov  8 01:35:11 UTC 2019 on sn-devel-184
2019-11-08 01:35:11 +00:00
Martin Schwenke
bf99f82077 ctdb-tests: Make process exists test more resilient
This can fail as follows:

--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--
Running test ./tests/UNIT/tool/ctdb.process-exists.003.sh (02:26:30)
--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--
ctdb.process-exists.003      - ctdbd process with multiple connections on node 0
Setting up fake ctdbd
<10||0|
OK
<10|PID 26107 exists
|0|
OK
==================================================
Running "ctdb -d NOTICE process-exists 26107 0x1234567812345678"
PASSED
==================================================
Running "ctdb -d NOTICE process-exists 26107 0xaebbccdd12345678"
Registered SRVID 0xaebbccdd12345678
--------------------------------------------------
Output (Exit status: 1):
--------------------------------------------------
PID 26107 with SRVID 0xaebbccdd12345678 does not exist
--------------------------------------------------
Required output (Exit status: 0):
--------------------------------------------------
PID 26107 with SRVID 0xaebbccdd12345678 exists

FAILED
connection to daemon closed, exiting
==========================================================================
TEST FAILED: ./tests/UNIT/tool/ctdb.process-exists.003.sh (status 1) (duration: 0s)
==========================================================================

This happens when dummy_client has not registered the SRVID (for its
10th connection) before the 2nd simple_test.

Change the initial wait to ensure that the SRVID is registered.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Nov  6 02:46:24 UTC 2019 on sn-devel-184
2019-11-06 02:46:24 +00:00
Martin Schwenke
dd9d5ec5c8 ctdb-tests: Improve code quality in ctdb_init()
Improve quoting and indentation.  Print a clear error if the cluster
goes back into recovery and doesn't come back out.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-11-06 01:22:30 +00:00
Martin Schwenke
3b5ed00054 ctdb-tests: No longer retry starting the cluster
Retrying like this hides bugs.  The cluster should come up first time,
every time.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-11-06 01:22:30 +00:00
Martin Schwenke
bf47bc18bb ctdb-tcp: Drop tracking of file descriptor for incoming connections
This file descriptor is owned by the incoming queue.  It will be
closed when the queue is torn down.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14175
RN: Avoid communication breakdown on node reconnect

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-11-06 01:22:30 +00:00
Martin Schwenke
d0baad257e ctdb-tcp: Avoid orphaning the TCP incoming queue
CTDB's incoming queue handling does not check whether an existing
queue exists, so can overwrite the pointer to the queue.  This used to
be harmless until commit c68b6f96f2
changed the read callback to use a parent structure as the callback
data.  Instead of cleaning up an orphaned queue on disconnect, as
before, this will now free the new queue.

At first glance it doesn't seem possible that 2 incoming connections
from the same node could be processed before the intervening
disconnect.  However, the incoming connections and disconnect occur on
different file descriptors.  The queue can become orphaned on node A
when the following sequence occurs:

1. Node A comes up
2. Node A accepts an incoming connection from node B
3. Node B processes a timeout before noticing that outgoing the queue is writable
4. Node B tears down the outgoing connection to node A
5. Node B initiates a new connection to node A
6. Node A accepts an incoming connection from node B

Node A processes then the disconnect of the old incoming connection
from (2) but tears down the new incoming connection from (6).  This
then occurs until the originally affected node is restarted.

However, due to the number of outgoing connection attempts and
associated teardowns, this induces the same behaviour on the
corresponding incoming queue on all nodes that node A attempts to
connect to.  Therefore, other nodes become affected and need to be
restarted too.

As a result, the whole cluster probably needs to be restarted to
recover from this situation.

The problem can occur any time CTDB is started on a node.

The fix is to avoid accepting new incoming connections when a queue
for incoming connections is already present.  The connecting node will
simply retry establishing its outgoing connection.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14175

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-11-06 01:22:30 +00:00
Martin Schwenke
e62b3a05a8 ctdb-tcp: Check incoming queue to see if incoming connection is up
This makes it consistent with the reverse case.  Also, in_fd will soon
be removed.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14175

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-11-06 01:22:30 +00:00
Björn Jacke
a456c2bb02 ctdb/utils/smnotify/smnotify.c: typo fixes
Signed-off-by: Bjoern Jacke <bjacke@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2019-10-31 00:43:39 +00:00
Björn Jacke
1e73161bdd ctdb/utils/scsi_io/scsi_io.c: typo fixes
Signed-off-by: Bjoern Jacke <bjacke@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2019-10-31 00:43:39 +00:00
Björn Jacke
f3754b6487 ctdb/server/ctdb_daemon.c: typo fixes
Signed-off-by: Bjoern Jacke <bjacke@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2019-10-31 00:43:39 +00:00
Björn Jacke
5d2a257c2e ctdb/server/ctdb_client.c: typo fixes
Signed-off-by: Bjoern Jacke <bjacke@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2019-10-31 00:43:39 +00:00
Björn Jacke
7722bd80fc ctdb/server/ctdb_call.c: typo fixes
Signed-off-by: Bjoern Jacke <bjacke@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2019-10-31 00:43:38 +00:00
Björn Jacke
9fa37484c3 ctdb/include/ctdb_private.h: typo fixes
Signed-off-by: Bjoern Jacke <bjacke@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2019-10-31 00:43:38 +00:00