mirror of https://github.com/samba-team/samba.git synced 2024-12-24 21:34:56 +03:00

109 Commits

Author SHA1 Message Date
Ronnie Sahlberg
9f99b44fd1 to make it easier/less disruptive to add nodes to a running cluster
add a new control that causes the node to drop the current nodes list
and reread it from the nodes file.
During this operation, the node will also drop the tcp layer and restart it.

We drop the tcp layer by talloc_free()ing the ctcp structure.
Add a destructor to ctcp so that we can also clean up and remove the references to the transport layer in the ctdb structure.

add two new commands to the ctdb tool:
one to list all nodes in the nodes file, and a second to trigger a node to drop the transport and reinitialize it with the new nodes file

(This used to be ctdb commit 4bc20ac73e9fa94ffd43cccb6eeb438eeff9963c)
2008-02-19 14:44:48 +11:00
Ronnie Sahlberg
87b38e01b2 the ctdb structure must make its own copy of the ->address field and not just
copy the content of the nodes structure.

this ctdb_address structure contains a pointer which is talloced hanging off the structure itself.
If we copy the content of this structure as we did in assigning to ctdb->address from nodes[i]
then if we talloc_free() the node structure we end up with a wild pointer in ctdb->address

(This used to be ctdb commit 644a7248548260d37df432979b129797750907f4)
2008-02-19 14:35:15 +11:00
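
The wild-pointer problem described in 87b38e01b2 can be sketched in plain C. This is a minimal illustration, not the ctdb code: it uses malloc()/free() in place of talloc, and the names example_address, example_address_copy and example_address_free are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an address structure that owns a heap string. */
struct example_address {
    char *address;   /* heap-allocated, owned by this structure */
    unsigned port;
};

/*
 * Deep copy: dst gets its own copy of the string, so freeing src later
 * cannot invalidate dst->address. A plain "*dst = *src" would copy only
 * the pointer and leave dst dangling once src is freed.
 * Returns 0 on success, -1 on allocation failure.
 */
int example_address_copy(struct example_address *dst,
                         const struct example_address *src)
{
    char *copy = NULL;

    assert(dst != NULL && src != NULL);
    if (src->address != NULL) {
        size_t len = strlen(src->address) + 1;
        copy = malloc(len);
        if (copy == NULL) {
            return -1;
        }
        memcpy(copy, src->address, len);
    }
    dst->address = copy;
    dst->port = src->port;
    return 0;
}

/* Release the string owned by a and clear the pointer. */
void example_address_free(struct example_address *a)
{
    assert(a != NULL);
    free(a->address);
    a->address = NULL;
}
```

With a copy made this way, the source structure can be freed while the copy stays valid, which is the property the commit message asks for.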
Andrew Tridgell
f6e53f433b merge from ronnie
(This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c)
2008-02-04 20:07:15 +11:00
Andrew Tridgell
9d6ac0cf55 added debug constants to allow for better mapping to syslog levels
(This used to be ctdb commit 7ba8f1dde318eab03f4257e5a89fd23e7281e502)
2008-02-04 17:44:24 +11:00
Ronnie Sahlberg
9e73dc87cc Add a --node-ip argument so that one can specify which ip address a
specific instance of ctdbd should bind to. This helps when running a
"virtual" cluster on a single machine where all instances bind to
different alias interfaces.

If --node-ip is specified, then we will try to bind to this ip address
only. Otherwise we fall back to the original method of trying the
ip addresses in /etc/ctdb/nodes one by one until we find one we can bind
to.

No variable in /etc/sysconfig/ctdb added since this parameter only makes 
sense in a virtual test/debug cluster.

(This used to be ctdb commit d96cb02c2c24f9eabbc53d3d38e90dea49cff3e0)
2007-11-26 10:52:55 +11:00
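
The two binding strategies described in 9e73dc87cc can be sketched as follows. This is an illustration under stated assumptions, not the ctdbd implementation: it handles IPv4 only, takes the node list as an array of strings rather than reading /etc/ctdb/nodes, and the names try_bind and bind_node_address are hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Try to bind a new TCP socket to ip:port. Returns the fd, or -1 on failure. */
static int try_bind(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    int fd;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        return -1;                      /* not a valid IPv4 address */
    }
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1) {
        return -1;
    }
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * If node_ip is given, bind to that address only. Otherwise try each
 * entry of nodes[] in order and keep the first one that binds; its
 * index is stored in *chosen when chosen is not NULL.
 * Returns the bound fd, or -1 if no address could be bound.
 */
int bind_node_address(const char *node_ip,
                      const char *const *nodes, size_t num_nodes,
                      unsigned short port, size_t *chosen)
{
    size_t i;

    assert(nodes != NULL || num_nodes == 0);
    if (node_ip != NULL) {
        return try_bind(node_ip, port);
    }
    for (i = 0; i < num_nodes; i++) {
        int fd = try_bind(nodes[i], port);
        if (fd != -1) {
            if (chosen != NULL) {
                *chosen = i;
            }
            return fd;
        }
    }
    return -1;
}
```

Passing a non-NULL node_ip corresponds to the --node-ip case; passing NULL corresponds to the fallback of walking the node list.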
Andrew Tridgell
8e22bca5ca fixed a double close of a socket, leading to an EPOLL error
(This used to be ctdb commit bbe8ad842bdfedd37ef14a6be07ad939113fe9b1)
2007-10-22 16:41:11 +10:00
Andrew Tridgell
2d8afd85d5 another place where we need to mark connect_fde as freed
(This used to be ctdb commit d047fbeafebe4b150602f9a91802795659058b16)
2007-10-22 15:13:32 +10:00
Andrew Tridgell
f09537e7f1 prevent a double free
(This used to be ctdb commit 5a1b923abb36c6deb99ae178fdd54f12235dc309)
2007-10-22 14:07:35 +10:00
Andrew Tridgell
f47f758fe8 merge from ronnie
(This used to be ctdb commit d444fdc7782496abe4b27003b647ac49fb52e6be)
2007-10-19 09:39:07 +10:00
Ronnie Sahlberg
d1ba047b7f add a new transport method so that when a node is marked as dead, we
shut down and restart the transport

otherwise, if we use the tcp transport, the tcp connection might try to
retransmit the queued data while the node is unavailable.
together with the exponential backoff for tcp, this means that the tcp
connection quickly reaches the maximum backoff rto, which is often 60 or
120 seconds. it could then take up to 60/120 seconds
before the tcp layer detects that the connection is dead and has to
be reestablished.

(This used to be ctdb commit 0256db470879ce556b0f00070f7ebeaf37e529ab)
2007-10-19 08:58:30 +10:00
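
The arithmetic behind the backoff argument in d1ba047b7f can be sketched with a deliberately simplified model: the retransmission timeout doubles after every failed retransmission until it reaches a cap. The 1 second starting value used in the note below is an assumption for illustration (real TCP stacks derive the timeout from measured round-trip times), and backoff_rto is a hypothetical helper.

```c
#include <assert.h>

/*
 * Simplified exponential backoff: return the retransmission timeout
 * (in seconds) in effect after `retries` failed retransmissions,
 * starting from initial_rto and doubling each time up to max_rto.
 */
unsigned backoff_rto(unsigned initial_rto, unsigned max_rto, unsigned retries)
{
    unsigned rto = initial_rto;

    assert(initial_rto > 0 && initial_rto <= max_rto);
    while (retries-- > 0 && rto < max_rto) {
        /* Double, but never exceed the cap (and avoid overflow). */
        rto = (rto > max_rto / 2) ? max_rto : rto * 2;
    }
    return rto;
}
```

Under this model a timeout that starts at 1 second reaches a 60 second cap after 6 failed retransmissions and a 120 second cap after 7, so a node that is unreachable for a short while can leave the connection waiting for the full cap before the next attempt.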
Ronnie Sahlberg
eb4cf6a686 change ctdb->vnn to ctdb->pnn
(This used to be ctdb commit 8c776e5707e503ec6586aae39ac6b3ea5a2fd2bc)
2007-09-04 10:06:36 +10:00
Ronnie Sahlberg
12ebb74838 change how we do public addresses and takeover so that we can have
multiple public addresses spread across multiple interfaces on each 
node.

this is a massive patch since we have previously made the assumption that
we only have one public address per node.

get rid of the public_interface argument.  the public addresses file
now explicitly lists which interface the address belongs to

(This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8)
2007-09-04 09:50:07 +10:00
Andrew Tridgell
32de198fd3 update lib/replace from samba4
(This used to be ctdb commit f0555484105668c01c21f56322992e752e831109)
2007-07-10 15:29:31 +10:00
Ronnie Sahlberg
027d40a5ee rename tnode->queue to tnode->out_queue to indicate this queue is for
sending data out to the other node

(This used to be ctdb commit 0bc949c529094570da56c9007ff96b1f5ad02c59)
2007-07-02 14:26:50 +10:00
Ronnie Sahlberg
3a71dcf505 when accepting an incoming connection, verify that the source address is
from one of the configured nodes and reject the connection otherwise

(This used to be ctdb commit ef290a6340eb1a1c0ae60c74b38c93396e388f73)
2007-07-02 14:10:20 +10:00
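
The check described in 3a71dcf505 can be sketched as a lookup of the peer address in the configured node list. This is an illustration only: it assumes IPv4 addresses given as strings, and is_configured_node is a hypothetical name, not a ctdb function.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>

/*
 * Return 1 if src matches one of the configured node addresses,
 * 0 otherwise. Entries that are not valid IPv4 addresses are skipped.
 */
int is_configured_node(const struct in_addr *src,
                       const char *const *nodes, size_t num_nodes)
{
    size_t i;

    assert(src != NULL);
    for (i = 0; i < num_nodes; i++) {
        struct in_addr node;

        if (inet_pton(AF_INET, nodes[i], &node) == 1 &&
            node.s_addr == src->s_addr) {
            return 1;
        }
    }
    return 0;
}
```

A caller would pass the sin_addr field of the peer address filled in by accept() and close the new socket when the function returns 0.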
Andrew Tridgell
2ed57a9ae1 implement a scheme where nodes are banned if they continuously cause the cluster
to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes)

(This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c)
2007-06-07 15:18:55 +10:00
Andrew Tridgell
be3a00bd73 clean out some more cruft
(This used to be ctdb commit ad16c5fe2748b48a6f6c79976359d56d9bed33f4)
2007-06-05 17:57:07 +10:00
Andrew Tridgell
5e5701a7b8 - make calling of recovered event script async
- shutdown sockets before calling shutdown script

(This used to be ctdb commit c5e099feef94a014a77742b6cc1d0afe78ef9da9)
2007-06-02 08:41:19 +10:00
Andrew Tridgell
bf3b740a1b ctdb is GPL not LGPL
(This used to be ctdb commit 8624378010d1c2a1438e1e701339dfba7276f960)
2007-05-31 13:50:53 +10:00
Andrew Tridgell
1e72af9c51 close sockets when we exec scripts
(This used to be ctdb commit 0fac2164db4279db2d7d376a34be05b890304087)
2007-05-30 15:43:25 +10:00
Andrew Tridgell
c833b06a35 we need to listen at transport initialise stage to find our own node number
(This used to be ctdb commit 4a9455dfbe95e53884b46ad26dba0c33e3432ba9)
2007-05-30 14:46:14 +10:00
Andrew Tridgell
8ed48aac51 don't start the transport connecting to the other nodes until after the startup event script has run
(This used to be ctdb commit afca3cc74211aa2e18b1f74d36b2add8dffcfdc7)
2007-05-30 13:26:50 +10:00
Andrew Tridgell
1140d5a20a fixed more warnings on 64 bit boxes
(This used to be ctdb commit 2f6eae476203f8a8b28e083553204c01f224c8a5)
2007-05-29 13:58:41 +10:00
Andrew Tridgell
7ff6e17ca1 removed bogus alignment check
(This used to be ctdb commit 93fd5fd01dc61a53a91e319d5cbbe0fc8f740717)
2007-05-26 18:13:19 +10:00
Andrew Tridgell
47b20f7e26 show op type of badly aligned packets in tcp layer
(This used to be ctdb commit 6a3e1faa2ce77ee021154d66aeaa99c51bbc8b06)
2007-05-26 16:35:41 +10:00
Andrew Tridgell
9aa692669b paranoid checks for bad packets in tcp layer. Close the socket if it gets a bad packet
(This used to be ctdb commit 1277089e5c6e1036517c63ee8c8e4ff98cb76cf8)
2007-05-26 16:32:32 +10:00
Andrew Tridgell
07ade57802 make sure we find out about new nodes as fast as possible
(This used to be ctdb commit 73f2c77166e2053625d0f76c370cf7e789a63fdf)
2007-05-25 22:07:45 +10:00
Andrew Tridgell
20d96ad5c5 enable TCP keepalives
(This used to be ctdb commit a44f760f6260359201d8431d2f1267af2bc6b1b1)
2007-05-15 18:40:56 +10:00
Andrew Tridgell
527b2352ac fixed two more places where we don't correctly handle write errors on sockets
(This used to be ctdb commit f4a71bb63e7f75d21b66f9eaeac997c2029cd146)
2007-05-15 14:08:58 +10:00
Andrew Tridgell
67f5601bef fixed a fd close error on reconnect
(This used to be ctdb commit 240651a6f67f914b06e273696cef6180d788221e)
2007-05-15 10:33:28 +10:00
Andrew Tridgell
7d3870d41f AIX needs sin_len field for bind()
(This used to be ctdb commit cd6c35d4aa4f4a4cfeedf6902cda84e43d7aeba4)
2007-05-15 09:42:52 +10:00
Andrew Tridgell
2dc24c7d56 added a hopcount in ctdb_call
(This used to be ctdb commit 36d838801a2a2008c50322cdbfff65a308b1cd1a)
2007-05-01 13:25:02 +10:00
Andrew Tridgell
5b8c4bba5a auto-determine listen address by attempting to bind to each address in the cluster in turn
(This used to be ctdb commit 2fab9f96df2e5b5c51c860fd65caf0e926a63e34)
2007-05-01 06:34:55 +10:00
Andrew Tridgell
ee228e870d fixed some warnings
(This used to be ctdb commit b5434a40cf2db008eb1e681fcd2ceeff331324fa)
2007-04-28 11:35:49 +02:00
Andrew Tridgell
c23d1694db merge from peter
(This used to be ctdb commit ddf390da2bceb5b3f431433aec424d99d98c05f4)
2007-04-26 15:28:13 +02:00
Andrew Tridgell
273a3944a8 - added a --torture option to all ctdb tools. This sets
CTDB_FLAG_TORTURE, which forces some race conditions to be much more
  likely. For example a 20% chance of not getting the lock on the
  first try in the daemon

- abstracted the ctdb_ltdb_lock_fetch_requeue() code to allow it to
  work with both inter-node packets and client->daemon packets

- fixed a bug left over in ctdb_call from when the client updated the
  header on a call reply

- removed CTDB_FLAG_CONNECT_WAIT flag (not needed any more)

(This used to be ctdb commit 7559dcd184666c3853127e3c8f5baef4fea327c4)
2007-04-19 16:27:56 +10:00
Andrew Tridgell
b79e29c779 - make the packet allocation routines take a mem_ctx, which allows
us to put memory directly in the right context, avoiding quite a few
  talloc_steal calls, and simplifying the code

- make the fetch lock code in the daemon fully async

(This used to be ctdb commit d98b4b4fcadad614861c0d44a3854d97b01d0f74)
2007-04-19 10:37:44 +10:00
Andrew Tridgell
fb84d56b1b make sure we notify ctdb when a node dies
(This used to be ctdb commit 598feb4fb9badcf329837965ad39e0f0dfe28498)
2007-04-17 19:41:29 +10:00
Andrew Tridgell
65cdf2297a private -> private_data for samba3
(This used to be ctdb commit 080b6901173afb2ad618dd0621876ff478c7d6e5)
2007-04-13 20:38:24 +10:00
Volker Lendecke
d8dd8fbe49 Rename "private" to "private_data"
(This used to be ctdb commit 78cf4443ac0c66fb750ef6919bcdec189ac219c9)
2007-04-11 20:12:15 +02:00
Andrew Tridgell
902967249c fix the queueing for partially connected tcp sockets
(This used to be ctdb commit 55f1c2442a53a547302669a4fdd0f1c1deeed930)
2007-04-10 20:48:31 +10:00
Andrew Tridgell
5861917468 make some functions static, and remove an unused structure
(This used to be ctdb commit 8d09cac96b2c604a68e4903346cc9db3a66d80da)
2007-04-10 19:40:29 +10:00
Andrew Tridgell
f1e0174e83 made all sockets handle partial IO
abstract IO via ctdb_queue_*() functions

(This used to be ctdb commit 636ae76f4632b29231db87be32c9114f58b37840)
2007-04-10 19:33:21 +10:00
Ronnie sahlberg
f2e2d1c2f3 change the tcp code to call ctdb_read_pdu() instead of doing the partial read thing explicitly
(This used to be ctdb commit 6156bec0187df27578afd5afa3fcaadb1a202030)
2007-04-10 13:17:15 +10:00
Ronnie sahlberg
91c39b4852 move the checking of the CONNECT_WAIT flag into the start method for tcp
(This used to be ctdb commit 44f3e4456d931af642192e034f84c961ab1fdcf0)
2007-04-10 12:39:25 +10:00
Ronnie sahlberg
a25554be50 When we create a tcp connection to a remote ctdb node, do an explicit bind() to set our source side to the same ip address as we use to listen to ctdb traffic.
We need this since there is no guarantee that INADDR_ANY (which would be defaulted to if we don't bind) would be routable from the remote host.
This can easily happen since CTDB traffic is likely to be isolated to a private non-routable network.

(This used to be ctdb commit e0743d2f84ca0088734c912e210deb93a6b78860)
2007-04-06 09:08:41 +10:00
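
The bind-before-connect pattern described in a25554be50 can be sketched with the standard sockets API. This is a minimal IPv4 illustration, not the ctdb code, and connect_from is a hypothetical helper name.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Open a TCP connection to remote_ip:remote_port whose source address
 * is fixed to local_ip. Returns the connected fd, or -1 on failure.
 */
int connect_from(const char *local_ip, const char *remote_ip,
                 unsigned short remote_port)
{
    struct sockaddr_in local, remote;
    int fd;

    assert(local_ip != NULL && remote_ip != NULL);

    memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_port = 0;                 /* let the kernel pick the source port */
    if (inet_pton(AF_INET, local_ip, &local.sin_addr) != 1) {
        return -1;
    }

    memset(&remote, 0, sizeof(remote));
    remote.sin_family = AF_INET;
    remote.sin_port = htons(remote_port);
    if (inet_pton(AF_INET, remote_ip, &remote.sin_addr) != 1) {
        return -1;
    }

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1) {
        return -1;
    }
    /* Pin the source address first, then connect. */
    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) != 0 ||
        connect(fd, (struct sockaddr *)&remote, sizeof(remote)) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling getsockname() on the returned descriptor reports local_ip as the source address, regardless of which address the routing table would otherwise have selected.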
Andrew Tridgell
f49c93f96b added --num-msgs option
added TCP_NODELAY on tcp sockets

(This used to be ctdb commit fa76cff388237adea98c2be0827c54334080256a)
2007-02-20 14:57:13 +11:00
Andrew Tridgell
ed6d9d0606 support hostnames for node names
(This used to be ctdb commit 5c45b51ec42cdbadce7870b47b765a79d8d41b8b)
2007-02-20 13:22:18 +11:00
Andrew Tridgell
979ef2832a merged from samba4 ctdb
(This used to be ctdb commit 677fd2a7758b743ea920d0b3adb85fbb3f1ff49e)
2007-02-07 13:26:07 +11:00
Andrew Tridgell
16d2ca6fa0 merge fixes from samba4
(This used to be ctdb commit fb90a5424348d0b6ed9a1b8da4ceadcc4d1a1cb1)
2007-01-23 11:38:45 +11:00
Andrew Tridgell
6dbaa5abfc simple ctdb benchmark
(This used to be ctdb commit eb80fd212472fe3b111dabe7adf6dd507fe3656a)
2006-12-19 16:27:03 +11:00
Andrew Tridgell
a3f91ddf57 enforce the tcp memory alignment in packet queue
(This used to be ctdb commit 222f53a3205509a45fbc3267297521df22a414ec)
2006-12-19 12:07:07 +11:00
Andrew Tridgell
3c097c9a5f added handling of partial packet reads
added transport level packet allocator, allowing the transport to
enforce alignment or special memory rules

(This used to be ctdb commit 50304a5c4d8d640732678eeed793857334ca5ec1)
2006-12-19 12:03:10 +11:00
Andrew Tridgell
35a627cc32 queue up packets to nodes that aren't connected yet. This avoids a
startup race condition in the test suite

(This used to be ctdb commit b623ac755de843a3386a7c0e882d651b7f20d482)
2006-12-01 15:54:15 +11:00
Andrew Tridgell
ec5d2ddd8e - added ctdb_set_flags() call
- added --self-connect option to ctdb_test, allowing testing when a
  node connects to itself. not as efficient as local bypass, but very
  useful for testing purposes (easier to work with 1 task in gdb than
  2)

- split the ctdb_call() into an async triple, in the style of Samba4
  async functions. So we now have ctdb_call_send(), ctdb_call_recv()
  and ctdb_call().

- added the main ctdb_call protocol logic. No error checking yet, but
  seems to work for simple cases

- ensure we initialise the length argument to getsockopt()

(This used to be ctdb commit 95fad717ef5ab93be3603aa11d2878876fe868d3)
2006-12-01 15:45:24 +11:00
Andrew Tridgell
fdb317facf - added simple (fake) vnn system
- split up ctdb layer code into 3 modules

- added a simple test suite

- added packet structures for ctdb_call

- switched to an array for ctdb_node to make vnn lookup easy and fast

(This used to be ctdb commit 8a17460a816a5970f2df8244a06aec55d814f186)
2006-11-28 17:56:10 +11:00
Andrew Tridgell
5d0ba69e06 - setup a convenience name field for nodes
- added basic IO handling for the tcp backend

- added a ctdb_node_dead upcall

- added packet queueing

- added incoming packet handling

(This used to be ctdb commit 415497c952630e746e8cdcf8e1e2a7b2ac3e51fb)
2006-11-28 14:15:46 +11:00
Andrew Tridgell
5b06e73fb1 - split up tcp functions into more logical parts
- added upcall methods from transport to ctdb layer

(This used to be ctdb commit 59f0dab652000f1c755e59567b03cf84dad7e954)
2006-11-28 11:51:33 +11:00
Andrew Tridgell
749a6b4c3a started splitting out transport code
(This used to be ctdb commit 3b75ef65bd0bff9c6366aba5a26b90be509fa77b)
2006-11-27 21:38:13 +11:00