samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2024-12-22 13:34:15 +03:00

Author	SHA1	Message	Date
Noel Power	0ff1b78fc2	ctdb-tcp: move free of inbound queue to TCP restart Since commit `77deaadca8`, a nodeA which had previously accepted a connection from nodeB (where nodeB dies e.g. as as result of fencing) when nodeB attempts to connect again after restarting is always rejected with ctdb_listen_event: Incoming queue active, rejecting connection from w.x.y.z messages. Consolidate dead node handling in the TCP restart handling. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14295 Signed-off-by: Noel Power <noel.power@suse.com> Reviewed-by: Ralph Boehme <slow@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net>	2020-03-12 03:47:30 +00:00
Martin Schwenke	e45feaf28d	ctdb-tcp: Simplify freeing of transport data on shutdown The type-checking is superfluous and gets in the way of readability. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Thu Nov 14 03:45:44 UTC 2019 on sn-devel-184	2019-11-14 03:45:44 +00:00
Martin Schwenke	750f3938e4	ctdb-daemon: Rename ctdb_context private_data to transport_data This gives a casual reader a useful clue. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-11-14 02:20:46 +00:00
Martin Schwenke	53f8492caa	ctdb-daemon: Rename ctdb_node private_data to transport_data This gives a casual reader a useful clue. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-11-14 02:20:46 +00:00
Volker Lendecke	a6d99d9e5c	ctdb-tcp: Close inflight connecting TCP sockets after fork Commit `c68b6f96f2` changed the talloc hierarchy such that outgoing TCP sockets while sitting in the async connect() syscall are not freed via ctdb_tcp_shutdown() anymore, they are hanging off a longer-running structure. Free this structure as well. If an outgoing TCP socket leaks into a long-running child process (possibly the recovery daemon), this connection will never be closed as seen by the destination node. Because with recent changes incoming connections will not be accepted as long as any incoming connection is alive, with that socket leak into the recovery daemon we will never again be able to successfully connect to the node that is affected by this leak. Further attempts to connect will be discarded by the destination as long as the recovery daemon keeps this socket alive. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14175 RN: Avoid communication breakdown on node reconnect Signed-off-by: Martin Schwenke <martin@meltin.net> Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-11-14 02:20:46 +00:00
Martin Schwenke	bf47bc18bb	ctdb-tcp: Drop tracking of file descriptor for incoming connections This file descriptor is owned by the incoming queue. It will be closed when the queue is torn down. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14175 RN: Avoid communication breakdown on node reconnect Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-11-06 01:22:30 +00:00
Martin Schwenke	7f4854d964	ctdb-tcp: Create outbound queue when the connection becomes writable Since commit `ddd97553f0` ctdb_queue_send() doesn't queue a packet if the connection isn't yet established (i.e. when fd == -1). So, don't bother creating the outbound queue during initialisation but create it when the connection becomes writable. Now the presence of the queue indicates that the outbound connection is up. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14084 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-08-16 21:30:35 +00:00
Martin Schwenke	c68b6f96f2	ctdb-tcp: Move incoming fd and queue into struct ctdb_tcp_node This makes it easy to track both incoming and outgoing connectivity states. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14084 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-08-16 21:30:35 +00:00
Martin Schwenke	c06620169f	ctdb-tcp: Rename fd -> out_fd in_fd is coming soon. Fix coding style violations in the affected and adjacent lines. Modernise some debug macros and make them more consistent (e.g. drop logging of errno when strerror(errno) is already logged. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14084 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-08-16 21:30:35 +00:00
Martin Schwenke	888ecc74ed	ctdb-tcp: Fix signed/unsigned comparisons by declaring as unsigned Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-05 05:03:23 +00:00
Amitay Isaacs	921d815da0	ctdb-transport: Replace ctdb_logging.h with common/logging.h Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Michael Adam <obnox@samba.org>	2015-11-16 00:46:15 +01:00
Amitay Isaacs	4647787773	ctdb-daemon: Separate prototypes for common client/server functions This groups function prototypes for common client/server functions in common/common.h and removes them from ctdb_private.h. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-30 02:00:27 +01:00
Amitay Isaacs	01c6c90e98	ctdb-daemon: Remove dependency on includes.h Instead of includes.h, include the required header files explicitly. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-30 02:00:27 +01:00
Amitay Isaacs	2fdb332fad	ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-30 02:00:27 +01:00
Martin Schwenke	a5be2c245d	ctdb-daemon: Store node addresses as ctdb_sock_addr rather than strings Every time a nodemap is contructed the node IP addresses all need to be parsed. This isn't very productive use of CPU. Instead, parse each string once when the nodes file is loaded. This results in much simpler code. This code also removes the use of ctdb_address. Duplicating the port is pointless without an abstraction layer around ctdb_address. If CTDB gets an incompatible transport in the future then add an abstraction layer. Note that the infiniband code is not updated. Compilation of the infiniband code is already broken. Fixing it will be a separate, properly tested effort. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>	2015-03-23 12:23:12 +01:00
Mathieu Parent	d82b9ae410	build: Fix tdb.h path to enable building with system TDB library (This used to be ctdb commit f8bf99de3a5f56be67aaa67ed836458b1cf73e86)	2013-06-14 16:45:27 +10:00
Amitay Isaacs	4392591555	Remove explicit include of lib/tevent/tevent.h. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 0681014ca5ed2a9b56f63fdace7f894beccf8a9a)	2012-04-13 17:28:14 +10:00
Rusty Russell	f93440c4b7	event: Update events to latest Samba version 0.9.8 In Samba this is now called "tevent", and while we use the backwards compatibility wrappers they don't offer EVENT_FD_AUTOCLOSE: that is now a separate tevent_fd_set_auto_close() function. This is based on Samba version `7f29f817fa`. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 85e5e760cc91eb3157d3a88996ce474491646726)	2010-08-18 09:16:31 +09:30
Rusty Russell	7061ceffd8	Report client for queue errors. We've been seeing "Invalid packet of length 0" errors, but we don't know what is sending them. Add a name for each queue, and print nread. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit e6cf0e8f14f4263fbd8b995418909199924827e9)	2010-07-01 23:08:49 +10:00
Ronnie Sahlberg	e6170b5389	add a new node state : DELETED. This is used to mark nodes as being DELETED internally in ctdb so that nodes are not renumbered if / when they are removed from the nodes file. This is used to be able to do "ctdb reloadnodes" at runtime without causing nodes to be renumbered. To do this, instead of deleting a node from the nodes file, just comment it out like 1.0.0.1 #1.0.0.2 1.0.0.3 After removing 1.0.0.2 from the cluster, the remaining nodes retain their pnn's from prior to the deletion, namely 0 and 2 Any line in the nodes file that is commented out represents a DELETED pnn (This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343)	2009-06-01 14:18:34 +10:00
Ronnie Sahlberg	7265c713db	we need to set the port properly in the parse_ip helper (This used to be ctdb commit 43fe18d86995744ba61c7a6405b70edcb265930a)	2009-03-24 13:45:11 +11:00
Ronnie Sahlberg	edb7241c05	redesign how reloadnodes is implemented. modify the transport methods to allow to restart individual connections and set up destructors properly. only tear down/set-up tcp connections to nodes removed from the cluster or nodes added to the cluster. Leave tcp connections to unchanged nodes connected. make "ctdb reloadnodes" explicitely cause a recovery of the cluster once the files have been realoaded (This used to be ctdb commit d1057ed6de7de9f2a64d8fa012c52647e89b515b)	2008-12-02 13:26:30 +11:00
Ronnie Sahlberg	1778280d50	When we reload the nodes file instead of shutting down/restarting the entire tcp layer just bounce all outgoing connections and reconnect (This used to be ctdb commit e701a531868149f16561011e65794a4a46ee6596)	2008-10-07 18:12:54 +11:00
root	fb5cc54206	listen_fd is auto-closed Closing it here just causes an epoll error, and may close a fd in use by another structure to be closed. This caused a infinite recovery loop (This used to be ctdb commit bc251ac7029c2689776a8c31b28ac1d233d52d4f)	2008-05-08 17:14:00 +10:00
Ronnie Sahlberg	9f99b44fd1	to make it easier/less disruptive to add nodes to a running cluster add a new control that causes the node to drop the current nodes list and reread it from the nodes file. During this operation, the node will also drop the tcp layer and restart it. When we drop the tcp layer, by talloc_free()ing the ctcp structure add a destructor to ctcp so that we also can clean up and remove the references in the ctdb structure to the transport layer add two new commands for the ctdb tool. one to list all nodes in the nodesfile and the second a command to trigger a node to drop the transport and reinitialize it with the nde nodes file (This used to be ctdb commit 4bc20ac73e9fa94ffd43cccb6eeb438eeff9963c)	2008-02-19 14:44:48 +11:00
Andrew Tridgell	f6e53f433b	merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c)	2008-02-04 20:07:15 +11:00
Andrew Tridgell	8e22bca5ca	fixed a double close of a socket, leading to an EPOLL error (This used to be ctdb commit bbe8ad842bdfedd37ef14a6be07ad939113fe9b1)	2007-10-22 16:41:11 +10:00
Andrew Tridgell	f47f758fe8	merge from ronnie (This used to be ctdb commit d444fdc7782496abe4b27003b647ac49fb52e6be)	2007-10-19 09:39:07 +10:00
Ronnie Sahlberg	d1ba047b7f	add a new transport method so that when a node is marked as dead, we shut down and restart the transport othervise, if we use the tcp transport the tcp connection might try to retransmit the queued data during the time the node is unavailable. this together with the exponential backoff for tcp means that the tcp connection quickly reaches the maximum backoff rto which is often 60 or 120 seconds. this would mean that it could take up to 60/120 seconds before the tcp layer detects that the connection is dead and it has to be reestablished. (This used to be ctdb commit 0256db470879ce556b0f00070f7ebeaf37e529ab)	2007-10-19 08:58:30 +10:00
Andrew Tridgell	32de198fd3	update lib/replace from samba4 (This used to be ctdb commit f0555484105668c01c21f56322992e752e831109)	2007-07-10 15:29:31 +10:00
Ronnie Sahlberg	027d40a5ee	rename tnode->queue to tnode->out_queue to indicate this queue is for sending data out to the other node (This used to be ctdb commit 0bc949c529094570da56c9007ff96b1f5ad02c59)	2007-07-02 14:26:50 +10:00
Andrew Tridgell	be3a00bd73	clean out some more cruft (This used to be ctdb commit ad16c5fe2748b48a6f6c79976359d56d9bed33f4)	2007-06-05 17:57:07 +10:00
Andrew Tridgell	5e5701a7b8	- make calling of recovered event script async - shutdown sockets before calling shutdown script (This used to be ctdb commit c5e099feef94a014a77742b6cc1d0afe78ef9da9)	2007-06-02 08:41:19 +10:00
Andrew Tridgell	bf3b740a1b	ctdb is GPL not LGPL (This used to be ctdb commit 8624378010d1c2a1438e1e701339dfba7276f960)	2007-05-31 13:50:53 +10:00
Andrew Tridgell	c833b06a35	we need to listen at transport initialise stage to find our own node number (This used to be ctdb commit 4a9455dfbe95e53884b46ad26dba0c33e3432ba9)	2007-05-30 14:46:14 +10:00
Andrew Tridgell	8ed48aac51	don't start the transport connecting to the other nodes until after the startup event script has run (This used to be ctdb commit afca3cc74211aa2e18b1f74d36b2add8dffcfdc7)	2007-05-30 13:26:50 +10:00
Andrew Tridgell	c23d1694db	merge from peter (This used to be ctdb commit ddf390da2bceb5b3f431433aec424d99d98c05f4)	2007-04-26 15:28:13 +02:00
Andrew Tridgell	273a3944a8	- added a --torture option to all ctdb tools. This sets CTDB_FLAG_TORTURE, which forces some race conditions to be much more likely. For example a 20% chance of not getting the lock on the first try in the daemon - abstraced the ctdb_ltdb_lock_fetch_requeue() code to allow it to work with both inter-node packets and client->daemon packets - fixed a bug left over in ctdb_call from when the client updated the header on a call reply - removed CTDB_FLAG_CONNECT_WAIT flag (not needed any more) (This used to be ctdb commit 7559dcd184666c3853127e3c8f5baef4fea327c4)	2007-04-19 16:27:56 +10:00
Andrew Tridgell	b79e29c779	- make he packet allocation routines take a mem_ctx, which allows us to put memory directly in the right context, avoiding quite a few talloc_steal calls, and simplifying the code - make the fetch lock code in the daemon fully async (This used to be ctdb commit d98b4b4fcadad614861c0d44a3854d97b01d0f74)	2007-04-19 10:37:44 +10:00
Volker Lendecke	d8dd8fbe49	Rename "private" to "private_data" (This used to be ctdb commit 78cf4443ac0c66fb750ef6919bcdec189ac219c9)	2007-04-11 20:12:15 +02:00
Andrew Tridgell	902967249c	fix the queueing for partially connected tcp sockets (This used to be ctdb commit 55f1c2442a53a547302669a4fdd0f1c1deeed930)	2007-04-10 20:48:31 +10:00
Andrew Tridgell	5861917468	make some functions static, and remove an unused structure (This used to be ctdb commit 8d09cac96b2c604a68e4903346cc9db3a66d80da)	2007-04-10 19:40:29 +10:00
Ronnie sahlberg	91c39b4852	move the checking of the CONNECT_WAIT flag into the start method for tcp (This used to be ctdb commit 44f3e4456d931af642192e034f84c961ab1fdcf0)	2007-04-10 12:39:25 +10:00
Andrew Tridgell	16d2ca6fa0	merge fixes from samba4 (This used to be ctdb commit fb90a5424348d0b6ed9a1b8da4ceadcc4d1a1cb1)	2007-01-23 11:38:45 +11:00
Andrew Tridgell	a3f91ddf57	enforce the tcp memory alignment in packet queue (This used to be ctdb commit 222f53a3205509a45fbc3267297521df22a414ec)	2006-12-19 12:07:07 +11:00
Andrew Tridgell	3c097c9a5f	added handling of partial packet reads added transport level packet allocator, allowing the transport to enforce alignment or special memory rules (This used to be ctdb commit 50304a5c4d8d640732678eeed793857334ca5ec1)	2006-12-19 12:03:10 +11:00
Andrew Tridgell	ec5d2ddd8e	- added ctdb_set_flags() call - added --self-connect option to ctdb_test, allowing testing when a node connects to itself. not as efficient as local bypass, but very useful for testing purposes (easier to work with 1 task in gdb than 2) - split the ctdb_call() into an async triple, in the style of Samba4 async functions. So we now have ctdb_call_send(), ctdb_call_recv() and ctdb_call(). - added the main ctdb_call protocol logic. No error checking yet, but seems to work for simple cases - ensure we initialise the length argument to getsockopt() (This used to be ctdb commit 95fad717ef5ab93be3603aa11d2878876fe868d3)	2006-12-01 15:45:24 +11:00
Andrew Tridgell	fdb317facf	- added simple (fake) vnn system - split up ctdb layer code into 3 modules - added a simple test suite - added packet structures for ctdb_call - switched to an array for ctdb_node to make vnn lookup easy and fast (This used to be ctdb commit 8a17460a816a5970f2df8244a06aec55d814f186)	2006-11-28 17:56:10 +11:00
Andrew Tridgell	5d0ba69e06	- setup a convenience name field for nodes - added basic IO handling for the tcp backend - added a ctdb_node_dead upcall - added packet queueing - adding incoming packet handling (This used to be ctdb commit 415497c952630e746e8cdcf8e1e2a7b2ac3e51fb)	2006-11-28 14:15:46 +11:00
Andrew Tridgell	5b06e73fb1	- split up tcp functions into more logical parts - added upcall methods from transport to ctdb layer (This used to be ctdb commit 59f0dab652000f1c755e59567b03cf84dad7e954)	2006-11-28 11:51:33 +11:00

50 Commits