samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2025-01-20 14:03:59 +03:00

Author	SHA1	Message	Date
Volker Lendecke	92b73cf0bf	ctdb-tcp: Close inflight connecting TCP sockets after fork Commit c68b6f96f26 changed the talloc hierarchy such that outgoing TCP sockets while sitting in the async connect() syscall are not freed via ctdb_tcp_shutdown() anymore, they are hanging off a longer-running structure. Free this structure as well. If an outgoing TCP socket leaks into a long-running child process (possibly the recovery daemon), this connection will never be closed as seen by the destination node. Because with recent changes incoming connections will not be accepted as long as any incoming connection is alive, with that socket leak into the recovery daemon we will never again be able to successfully connect to the node that is affected by this leak. Further attempts to connect will be discarded by the destination as long as the recovery daemon keeps this socket alive. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14175 RN: Avoid communication breakdown on node reconnect Signed-off-by: Martin Schwenke <martin@meltin.net> Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit a6d99d9e5c5bc58e6d56be7a6c1dbc7c8d1a882f) Autobuild-User(v4-9-test): Karolin Seeger <kseeger@samba.org> Autobuild-Date(v4-9-test): Wed Nov 20 14:58:33 UTC 2019 on sn-devel-144	2019-11-20 14:58:32 +00:00
Martin Schwenke	0dcb2efb8f	ctdb-tcp: Drop tracking of file descriptor for incoming connections This file descriptor is owned by the incoming queue. It will be closed when the queue is torn down. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14175 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit bf47bc18bb8a94231870ef821c0352b7a15c2e28)	2019-11-20 11:15:25 +00:00
Martin Schwenke	14406d123a	ctdb-tcp: Avoid orphaning the TCP incoming queue CTDB's incoming queue handling does not check whether a queue already exists, so can overwrite the pointer to the queue. This used to be harmless until commit c68b6f96f26664459187ab2fbd56767fb31767e0 changed the read callback to use a parent structure as the callback data. Instead of cleaning up an orphaned queue on disconnect, as before, this will now free the new queue. At first glance it doesn't seem possible that 2 incoming connections from the same node could be processed before the intervening disconnect. However, the incoming connections and disconnect occur on different file descriptors. The queue can become orphaned on node A when the following sequence occurs: 1. Node A comes up 2. Node A accepts an incoming connection from node B 3. Node B processes a timeout before noticing that the outgoing queue is writable 4. Node B tears down the outgoing connection to node A 5. Node B initiates a new connection to node A 6. Node A accepts an incoming connection from node B Node A then processes the disconnect of the old incoming connection from (2) but tears down the new incoming connection from (6). This then occurs until the originally affected node is restarted. However, due to the number of outgoing connection attempts and associated teardowns, this induces the same behaviour on the corresponding incoming queue on all nodes that node A attempts to connect to. Therefore, other nodes become affected and need to be restarted too. As a result, the whole cluster probably needs to be restarted to recover from this situation. The problem can occur any time CTDB is started on a node. The fix is to avoid accepting new incoming connections when a queue for incoming connections is already present. The connecting node will simply retry establishing its outgoing connection. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14175 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit d0baad257e511280ff3e5c7372c38c43df841070)	2019-11-20 11:15:25 +00:00
Martin Schwenke	20b823fc25	ctdb-tcp: Check incoming queue to see if incoming connection is up This makes it consistent with the reverse case. Also, in_fd will soon be removed. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14175 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit e62b3a05a874db13a848573d2e2fb1c157393b9c)	2019-11-20 11:15:25 +00:00
Amitay Isaacs	6024163e17	ctdb-vacuum: Process all records not deleted on a remote node This currently skips the last record. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14147 RN: Avoid potential data loss during recovery after vacuuming error Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> (cherry picked from commit 33f1c9d9654fbdcb99c23f9d23c4bbe2cc596b98)	2019-10-16 12:16:21 +00:00
Martin Schwenke	9a5bdc6c9e	ctdb-tools: Stop deleted nodes from influencing ctdb nodestatus exit code Deleted nodes should simply be ignored. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14129 RN: Stop deleted nodes from influencing ctdb nodestatus exit code Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 32b5ceb31936ec5447362236c1809db003561d29) Autobuild-User(v4-9-test): Karolin Seeger <kseeger@samba.org> Autobuild-Date(v4-9-test): Fri Sep 20 14:09:11 UTC 2019 on sn-devel-144	2019-09-20 14:09:11 +00:00
Ralph Boehme	b9f1be5cf4	ctdb: fix compilation on systems with glibc robust mutexes On older systems like SLES 11 without POSIX robust mutexes, but with glibc robust mutexes where all the functions are available but have a "_np" suffix, compilation fails in: ctdb/tests/src/test_mutex_raw.c.239.o: In function `worker': /root/samba-4.10.6/bin/default/../../ctdb/tests/src/test_mutex_raw.c:129: undefined reference to `pthread_mutex_consistent' ctdb/tests/src/test_mutex_raw.c.239.o: In function `main': /root/samba-4.10.6/bin/default/../../ctdb/tests/src/test_mutex_raw.c:285: undefined reference to `pthread_mutex_consistent' /root/samba-4.10.6/bin/default/../../ctdb/tests/src/test_mutex_raw.c:332: undefined reference to `pthread_mutexattr_setrobust' /root/samba-4.10.6/bin/default/../../ctdb/tests/src/test_mutex_raw.c:363: undefined reference to `pthread_mutex_consistent' collect2: ld returned 1 exit status This could be fixed by using libreplace system/threads.h instead of pthread.h directly, but as there has been a desire to keep test_mutex_raw.c standalone and compilable without external dependencies other than libc and libpthread, make the tool developer build only. This should get the average user over the cliff. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14038 RN: Fix compiling ctdb on older systems lacking POSIX robust mutexes Signed-off-by: Ralph Boehme <slow@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> (cherry picked from commit f5388f97792ac2d7962950dad91aaf8ad49bceaa) Autobuild-User(v4-9-test): Karolin Seeger <kseeger@samba.org> Autobuild-Date(v4-9-test): Thu Sep 5 16:12:34 UTC 2019 on sn-devel-144	2019-09-05 16:12:34 +00:00
Martin Schwenke	745052cb6b	ctdb-recoverd: Fix typo in previous fix BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue Aug 27 15:29:11 UTC 2019 on sn-devel-184 (cherry picked from commit 8190993d99284162bd8699780248bb2edfec2673)	2019-09-03 12:05:40 +00:00
Martin Schwenke	89b08e4fbc	ctdb-tests: Clear deleted record via recovery instead of vacuuming This test has been flapping because sometimes the record is not vacuumed within the expected time period, perhaps even because the check for the record can interfere with vacuuming. However, instead of waiting for vacuuming the record can be cleared by doing a recovery. This should be much more reliable. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085 RN: Fix flapping CTDB tests Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Wed Aug 21 13:06:57 UTC 2019 on sn-devel-184 (backported from commit 71ad473ba805abe23bbe6c1a1290612e448e73f3) Signed-off-by: Martin Schwenke <martin@meltin.net>	2019-09-03 12:05:40 +00:00
Martin Schwenke	4cbd3cd970	ctdb-tests: Strengthen volatile DB traverse test Check the record count more often, from multiple nodes. Add a case with multiple records. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit ca4df06080709adf0cbebc95b0a70b4090dad5ba)	2019-09-03 12:05:40 +00:00
Martin Schwenke	3801c9582b	ctdb-recoverd: Only check for LMASTER nodes in the VNN map BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 5d655ac6f2ff82f8f1c89b06870d600a1a3c7a8a)	2019-09-03 12:05:39 +00:00
Martin Schwenke	68cc58437f	ctdb-tests: Don't retrieve the VNN map from target node for notlmaster Use the VNN map from the node running node_has_status(). This means that wait_until_node_has_status 1 notlmaster 10 0 will run "ctdb status" on node 0 and check (for up to 10 seconds) if node 1 is in the VNN map. If the LMASTER capability has been dropped on node 1 then the above will wait for the VNN map to be updated on node 0. This will happen as part of the recovery that is triggered by the change of LMASTER capability. The next command will then only be able to attach to $TESTDB after the recovery is complete thus guaranteeing a sane state for the test to continue. This stops simple/79_volatile_db_traverse.sh from going into recovery during the traverse or at some other inconvenient time. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 53daeb2f878af1634a26e05cb86d87e2faf20173)	2019-09-03 12:05:39 +00:00
Martin Schwenke	31066fde8c	ctdb-tests: Handle special cases first and return All the other cases involve matching bits. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit bff1a3a548a2cace997b767d78bb824438664cb7)	2019-09-03 12:05:39 +00:00
Martin Schwenke	c3f2c55320	ctdb-tests: Inline handling of recovered and notlmaster statuses BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit bb59073515ee5f7886b5d9a20d7b2805857c2708)	2019-09-03 12:05:38 +00:00
Martin Schwenke	cf39c0fc3b	ctdb-tests: Drop unused node statuses frozen/unfrozen Silently drop unused local variable mpat. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 9b09a87326af28877301ad27bcec5bb13744e2b6)	2019-09-03 12:05:38 +00:00
Martin Schwenke	fd8a55bb3f	ctdb-tests: Reformat node_has_status() Re-indent and drop non-POSIX left-parenthesis from case labels. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 52227d19735a3305ad633672c70385f443f222f0)	2019-09-03 12:05:37 +00:00
Martin Schwenke	fcf29cda0e	ctdb-daemon: Make node inactive in the NODE_STOP control Currently some of this is supported by a periodic check in the recovery daemon's main_loop(), which notices the flag change, sets recovery mode active and freezes databases. If STOP_NODE returns immediately then the associated recovery can complete and the node can be continued before databases are actually frozen. Instead, immediately do all of the things that make a node inactive. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14087 RN: Stop "ctdb stop" from completing before freezing databases Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue Aug 20 08:32:27 UTC 2019 on sn-devel-184 (cherry picked from commit e9f2e205ee89f4f3d6302cc11b4d0eb2efaf0f53) Autobuild-User(v4-9-test): Karolin Seeger <kseeger@samba.org> Autobuild-Date(v4-9-test): Wed Aug 28 12:04:13 UTC 2019 on sn-devel-144	2019-08-28 12:04:13 +00:00
Martin Schwenke	fa705bc7de	ctdb-daemon: Drop unused function ctdb_local_node_got_banned() BUG: https://bugzilla.samba.org/show_bug.cgi?id=14087 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 91ac4c13d8472955d1f04bd775ec4b3ff8bf1b61)	2019-08-28 07:36:30 +00:00
Martin Schwenke	c2ee9bbeee	ctdb-daemon: Switch banning code to use ctdb_node_become_inactive() There's no reason to avoid immediately setting recovery mode to active and initiating freeze of databases. This effectively reverts the following commits: d8f3b490bbb691c9916eed0df5b980c1aef23c85 b4357a79d916b1f8ade8fa78563fbef0ce670aa9 The latter is now implemented using a control, resulting in looser coupling. See also the following commit: f8141e91a693912ea1107a49320e83702a80757a BUG: https://bugzilla.samba.org/show_bug.cgi?id=14087 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 0f5f7b7cf4e970f3f36c5e0b3d09e710fe90801a)	2019-08-28 07:36:30 +00:00
Martin Schwenke	13780a3ee0	ctdb-daemon: Factor out new function ctdb_node_become_inactive() This is a superset of ctdb_local_node_got_banned() so will replace that function, and will also be used in the NODE_STOP control. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14087 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit a42bcaabb63722411bee52b80cbfc795593defbc)	2019-08-28 07:36:30 +00:00
Martin Schwenke	f4442942fb	ctdb-tcp: Mark node as disconnected if incoming connection goes away To make it easy to pass the node data to the upcall, the private data for ctdb_tcp_read_cb() needs to be changed from tnode to node. RN: Avoid marking a node as connected before it can receive packets BUG: https://bugzilla.samba.org/show_bug.cgi?id=14084 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Fri Aug 16 22:50:35 UTC 2019 on sn-devel-184 (cherry picked from commit 73c850eda4209b688a169aeeb20c453b738cbb35)	2019-08-28 07:36:30 +00:00
Martin Schwenke	1e45ab3c23	ctdb-tcp: Only mark a node connected if both directions are up Nodes are currently marked as up if the outgoing connection is established. However, if the incoming connection is not yet established then this node could send a request where the replying node can not queue its reply. Wait until both directions are up before marking a node as connected. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14084 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 8c98c10f242bc722beffc711e85c0e4f2e74cd57)	2019-08-28 07:36:30 +00:00
Martin Schwenke	9155ad23d4	ctdb-tcp: Create outbound queue when the connection becomes writable Since commit ddd97553f0a8bfaada178ec4a7460d76fa21f079 ctdb_queue_send() doesn't queue a packet if the connection isn't yet established (i.e. when fd == -1). So, don't bother creating the outbound queue during initialisation but create it when the connection becomes writable. Now the presence of the queue indicates that the outbound connection is up. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14084 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 7f4854d9643a096a6d8a354fcd27b7c6ed24a75e)	2019-08-28 07:36:30 +00:00
Martin Schwenke	f2ce6c745c	ctdb-tcp: Use TALLOC_FREE() BUG: https://bugzilla.samba.org/show_bug.cgi?id=14084 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit d80d9edb4dc107b15a35a39e5c966a3eaed6453a)	2019-08-28 07:36:30 +00:00
Martin Schwenke	b21bc19bae	ctdb-tcp: Move incoming fd and queue into struct ctdb_tcp_node This makes it easy to track both incoming and outgoing connectivity states. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14084 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit c68b6f96f26664459187ab2fbd56767fb31767e0)	2019-08-28 07:36:30 +00:00
Martin Schwenke	17f1a95203	ctdb-tcp: Rename fd -> out_fd in_fd is coming soon. Fix coding style violations in the affected and adjacent lines. Modernise some debug macros and make them more consistent (e.g. drop logging of errno when strerror(errno) is already logged). BUG: https://bugzilla.samba.org/show_bug.cgi?id=14084 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit c06620169fc178ea6db2631f03edf008285d8cf2)	2019-08-28 07:36:30 +00:00
Martin Schwenke	a8dd1a0577	ctdb-daemon: Add function ctdb_ip_to_node() This is the core logic from ctdb_ip_to_pnn(), so re-implement that function using ctdb_ip_to_node(). Something similar (ctdb_ip_to_nodeid()) was recently removed in commit 010c1d77cd7e192b1fff39b7b91fccbdbbf4a786 because it wasn't required. Now there is a use case. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14084 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 3acb8e9d1c854b577d6be282257269df83055d31)	2019-08-28 07:36:30 +00:00
Martin Schwenke	a309b862e8	ctdb-daemon: Replace function ctdb_ip_to_nodeid() with ctdb_ip_to_pnn() Node ID is a poorly defined concept, indicating the slot in the node map where the IP address was found. This signed value also ends up compared to num_nodes, which is unsigned, producing unwanted warnings. Just return the PNN because this is what both callers really want. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 010c1d77cd7e192b1fff39b7b91fccbdbbf4a786)	2019-08-28 07:36:30 +00:00
Rafael David Tinoco	de909ff886	ctdb-config: depend on /etc/ctdb/nodes file CTDB should start as a disabled unit (systemd) in most distributions and, when trying to enable it for the first time, the user should get an unconfigured, or similar, error. Depending on the /etc/ctdb/nodes file gives the user a clear indication of what is needed to get the cluster up and running. It should work like the previous ENABLED=NO variables in SysV-like initialization scripts. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14017 RN: ctdb.service should only start if /etc/ctdb/nodes is not empty Signed-off-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit c5803507df7def388edcd5b6cbfee30cd217b536)	2019-08-08 07:32:21 +00:00
Rafael David Tinoco via samba-technical	44b5168845	ctdb-scripts: Fix tcp_tw_recycle existence check net.ipv4.tcp_tw_recycle was removed in Linux 4.12, but it still makes sense to check for its existence. Unfortunately, the current check does not test whether the procfs file exists. This commit fixes the issue. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13984 Signed-off-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Andreas Schneider <asn@cryptomilk.org> Autobuild-Date(master): Tue Jun 4 23:31:24 UTC 2019 on sn-devel-184 (cherry picked from commit 843fbb1207ee7ac84f3282974b66b9290d8da0ac)	2019-06-21 07:56:21 +00:00
Amitay Isaacs	8b52325985	ctdb-common: Fix memory leak in run_proc BUG: https://bugzilla.samba.org/show_bug.cgi?id=13943 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue May 14 08:59:03 UTC 2019 on sn-devel-184 (cherry picked from commit b1f4c86eea022999d5439e4a6ef3494fe41479b6) Autobuild-User(v4-9-test): Karolin Seeger <kseeger@samba.org> Autobuild-Date(v4-9-test): Fri May 17 10:56:19 UTC 2019 on sn-devel-144	2019-05-17 10:56:19 +00:00
Martin Schwenke	5419978537	ctdb-common: Fix memory leak BUG: https://bugzilla.samba.org/show_bug.cgi?id=13943 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 30bc6e2529cdd444d4ec7902844c3a6fb0858090)	2019-05-17 07:18:32 +00:00
Martin Schwenke	76c7302105	ctdb-recoverd: Fix memory leak state is always freed before exiting this function, so allocate fde off it instead of long-lived ctdb context. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13943 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 6a2941e2a9fd6ab2d5b8dbac042b61a7b1b0b914)	2019-05-17 07:18:32 +00:00
Andreas Schneider	925871f580	ctdb:common: Do not print NULL if we don't get a sockpath sock_socket_start_recv() might not fill sockpath if we return early. Found by GCC 9. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13937 Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org> (cherry picked from commit 830cb7e67568de5f3ce359cb6af3be8ab545c824)	2019-05-17 07:18:31 +00:00
Martin Schwenke	1c2c081f43	ctdb-daemon: Never use 0 as a client ID ctdb_control_db_attach() and ctdb_control_db_detach() assume that any control with client ID 0 comes from another daemon and treat it specially. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13930 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 8663e0a64fbdb9ea16babbfe87d6f5d7a7b72bbd)	2019-05-17 07:18:30 +00:00
Martin Schwenke	24d70220b2	ctdb-tests: Fix logic error in simple ctdb reloadips test There is a chance that restoring IP addresses to the test node will result in different IP addresses being assigned to that node. Removing a single IP address may then fail (or be a no-op) if it is done after the restore. So, swap the single IP address removal to happen first, then restore, then remove all IP addresses. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit dc89db8ca6aadd4a9f7e8a85843c53709d04587c)	2019-05-17 07:18:30 +00:00
Martin Schwenke	9f679ba14d	ctdb-tests: Make ctdb reloadips tests more reliable ctdb reloadips will fail if it can't disable takeover runs. The most likely reason for this is that there is already a takeover run in progress. We can't predict when this will happen, so retry if this occurs. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 8be4ee1a28d5c037955832b6f827d40f28f02796)	2019-05-17 07:18:30 +00:00
Martin Schwenke	0ffba5145c	ctdb-tests: Capture output in $out on failure as well BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit cf00db40355b49443263187f9d97934f91287e51)	2019-05-17 07:18:30 +00:00
Martin Schwenke	1eb5d2e4fc	ctdb-tests: Don't clean up test var directory in autotest target If the directory is always cleaned up then it is not possible to look at daemon logs to debug test failures. This target is only really used by autobuild.py, which (optionally) cleans up the parent directory anyway. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue May 7 06:56:01 UTC 2019 on sn-devel-184 (cherry picked from commit 5a9e338330fe136908a3a17a5df81c054c5cc5b0)	2019-05-17 07:18:30 +00:00
Martin Schwenke	15e5d62b3d	ctdb-tests: Fix usage message Since commit 0e9ead8f28fced3ebfa888786a1dc5bb59e734a3 daemons have been shut down after each test, so this option no longer has anything to do with killing daemons. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit a2ab6485e027ebb13871c7d83b7626ac5c9b98c0)	2019-05-17 07:18:30 +00:00
Martin Schwenke	814471f46e	ctdb-tests: Wait to allow database attach/detach to take effect Sometimes the detach test fails: Check detaching single test database detach_test1.tdb BAD: database detach_test1.tdb is still attached Number of databases:4 dbid:0x5ae995ee name:detach_test4.tdb path:tests/var/simple/node.0/db/volatile/detach_test4.tdb.0 dbid:0xd84cc13c name:detach_test3.tdb path:tests/var/simple/node.0/db/volatile/detach_test3.tdb.0 dbid:0x8e8e8cef name:detach_test2.tdb path:tests/var/simple/node.0/db/volatile/detach_test2.tdb.0 dbid:0xc62491f4 name:detach_test1.tdb path:tests/var/simple/node.0/db/volatile/detach_test1.tdb.0 Number of databases:3 dbid:0x5ae995ee name:detach_test4.tdb path:tests/var/simple/node.1/db/volatile/detach_test4.tdb.1 dbid:0xd84cc13c name:detach_test3.tdb path:tests/var/simple/node.1/db/volatile/detach_test3.tdb.1 dbid:0x8e8e8cef name:detach_test2.tdb path:tests/var/simple/node.1/db/volatile/detach_test2.tdb.1 Number of databases:4 dbid:0x5ae995ee name:detach_test4.tdb path:tests/var/simple/node.2/db/volatile/detach_test4.tdb.2 dbid:0xd84cc13c name:detach_test3.tdb path:tests/var/simple/node.2/db/volatile/detach_test3.tdb.2 dbid:0x8e8e8cef name:detach_test2.tdb path:tests/var/simple/node.2/db/volatile/detach_test2.tdb.2 dbid:0xc62491f4 name:detach_test1.tdb path:tests/var/simple/node.2/db/volatile/detach_test1.tdb.2 *** TEST COMPLETED (RC=1) AT 2019-04-27 03:35:40, CLEANING UP... When issued from a client, the detach control re-broadcasts itself asynchronously to all nodes and then returns success. The controls to some nodes to do the actual detach may still be in flight when success is returned to the client. Therefore, the test should wait for a few seconds to allow the asynchronous controls to complete. The same is true for the attach control, so work around the problem in the attach test too. An alternative is to make the attach and detach controls synchronous by avoiding the broadcast and waiting for the results of the individual controls sent to the nodes. However, a simple implementation would involve adding new nested event loops. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 3cb53a7a05409925024d6a67bcfaeb962d896e0b)	2019-05-17 07:18:30 +00:00
Martin Schwenke	3f104bd0db	ctdb-tests: Avoid bulk output in $out, prefer $outfile BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 066cc5b0c561464ed08890d9aa1a1a55b545e9cc)	2019-05-17 07:18:30 +00:00
Martin Schwenke	b594f5161d	ctdb-tests: Make try_command_on_node less error-prone This sometimes fails, apparently due to a cat process in onnode getting EAGAIN. The conclusion is that tests that process large amounts of output should not depend on a sub-shell delivering that output into a shell variable. Change try_command_on_node() to leave all of the output in file $outfile and just put the first 1KB into $out. $outfile is removed after each test completes. Change the implementation of sanity_check_output() to use $outfile instead of $out. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 9d02452a24625df5f62fd6d45a16effe2fa45fbe)	2019-05-17 07:18:29 +00:00
Martin Schwenke	7c97bc8328	ctdb-tests: Change sanity_check_output() to internally use $out All callers are currently passed $out. Global variable $out is used in many other places so use it here to simplify the interface and make future changes simpler. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 7c3819d1ac264acf998f426e0cef7f6211e0ddee)	2019-05-17 07:18:29 +00:00
Martin Schwenke	30b5d837d5	ctdb-tests: Extend test to cover ctdb rddumpmemory BUG: https://bugzilla.samba.org/show_bug.cgi?id=13923 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 8108b3134c017c22d245fc5b2207a88d44ab0dd2)	2019-05-17 07:18:29 +00:00
Martin Schwenke	08e229df43	ctdb-tools: Fix ctdb dumpmemory to avoid printing trailing NUL Fix ctdb rddumpmemory too. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13923 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit f78d9388fb459dc83fafb4da6e683e3137ad40e1)	2019-05-17 07:18:29 +00:00
Amitay Isaacs	945a41d384	ctdb-common: Avoid race between fd and signal events BUG: https://bugzilla.samba.org/show_bug.cgi?id=13895 In run_proc, there was an implicit assumption that when a process exits, fd event (pipe between parent and child) would be processed first and signal event (SIGCHLD for the child) would be processed later. However, that is not the case. SIGCHLD can be received asynchronously any time even when the pipe data has not fully been read. This causes run_proc to miss some of the output from child process in tests. When SIGCHLD is being processed, if the pipe between parent and child is still open, then do an explicit read from the pipe to ensure we read any data still in the pipe before closing the pipe. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri Apr 12 08:19:29 UTC 2019 on sn-devel-144 (cherry picked from commit 289201277cd983b27cdfd5376c607eab112b4082) Autobuild-User(v4-9-test): Karolin Seeger <kseeger@samba.org> Autobuild-Date(v4-9-test): Mon Apr 15 12:55:46 UTC 2019 on sn-devel-144	2019-04-15 12:55:46 +00:00
Martin Schwenke	d9c47cb86e	ctdb-daemon: Revert "We can not assume that just because we could complete a TCP handshake" We also can not assume that nodes can be marked as connected via only the keepalive mechanism. Keepalives are not sent to disconnected nodes so, in the absence of other packets (e.g. broadcasts), 2 nodes may never become marked as connected to each other. Revert to marking nodes as connected in the TCP transport code. If a connection is to a non(-operational) ctdbd then it will revert to disconnected after a short while and may actually flap. This should be rare. This reverts commit 66919db3d7ab1e091223faf515b183af8bfddc83. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13888 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> (cherry picked from commit 38dc6d11a26c2e9a2cae7927321f2216ceb1c5ec)	2019-04-15 08:28:11 +00:00
Martin Schwenke	49fa08814e	ctdb-scripts: Update statd-callout to try several configuration files The alternative seems to be to try something via CTDB_NFS_CALLOUT. That would be complicated and seems like overkill for something this simple. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@samba.org> (cherry picked from commit a2bd4085896804ee2da811e17f18c78a5bf4e658)	2019-04-12 07:57:11 +00:00
Martin Schwenke	dae0e8ec96	ctdb-scripts: Allow load_system_config() to take multiple alternatives The situation for NFS config has got more complicated and is probably broken in statd-callout on Debian-like systems at the moment. Allow several alternative configuration names to be tried. Stop after the first that is found and loaded. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@samba.org> (cherry picked from commit 0d67ea5fcca766734ecc73ad6b0139f7c13a15c5)	2019-04-12 07:57:11 +00:00
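
Note on the ctdb-tcp commits 9155ad23d4, 1e45ab3c23, f4442942fb and 14406d123a: taken together, their messages describe a connection-state rule that can be sketched roughly as below. This is a language-neutral illustration in Python, not the CTDB C code. The class `TcpNode` and its methods are hypothetical names; only the rule itself (a queue's presence means that direction is up, both directions must be up, and a second incoming connection is refused) comes from the commit messages.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TcpNode:
    """Hypothetical stand-in for the per-node TCP transport state."""

    # Present once the outgoing connection has become writable.
    out_queue: Optional[object] = None
    # Present once an incoming connection has been accepted.
    in_queue: Optional[object] = None

    def is_connected(self) -> bool:
        # Both directions must be up before the node counts as connected.
        return self.out_queue is not None and self.in_queue is not None

    def accept_incoming(self, conn: object) -> bool:
        # Refuse a second incoming connection rather than overwrite the
        # existing queue; the connecting node is expected to retry.
        if self.in_queue is not None:
            return False
        self.in_queue = conn
        return True

    def incoming_closed(self) -> None:
        # Losing the incoming connection makes the node disconnected.
        self.in_queue = None
```

In this sketch a retry from the peer only succeeds after `incoming_closed()` has run for the old connection, which is the ordering the fix in 14406d123a relies on.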

8088 Commits
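
Note on commit b594f5161d: the message describes keeping all command output in a file ($outfile) and only the first 1KB in a variable ($out). The real change is in the CTDB shell test helpers; the Python below is only a sketch of the same idea, and `run_and_capture` and `OUT_LIMIT` are hypothetical names introduced for illustration.

```python
import subprocess
import tempfile

OUT_LIMIT = 1024  # "the first 1KB", per the commit message


def run_and_capture(cmd):
    """Run cmd; return (exit code, first 1KB of output, path to full output)."""
    # Send all output straight to a file so that nothing depends on a
    # pipe or sub-shell delivering a large amount of data in one piece.
    with tempfile.NamedTemporaryFile(mode="wb", delete=False,
                                     suffix=".out") as f:
        proc = subprocess.run(cmd, stdout=f, stderr=subprocess.STDOUT)
        path = f.name
    # Keep only a short prefix in memory for quick checks and messages.
    with open(path, "rb") as f:
        prefix = f.read(OUT_LIMIT)
    return proc.returncode, prefix, path
```

Checks that need the complete output would read the file at the returned path, much as sanity_check_output() was changed to use $outfile. The caller is responsible for removing the file afterwards.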