samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2024-12-24 21:34:56 +03:00

Author	SHA1	Message	Date
Ronnie Sahlberg	edb7241c05	redesign how reloadnodes is implemented. modify the transport methods to allow to restart individual connections and set up destructors properly. only tear down/set-up tcp connections to nodes removed from the cluster or nodes added to the cluster. Leave tcp connections to unchanged nodes connected. make "ctdb reloadnodes" explicitly cause a recovery of the cluster once the files have been reloaded (This used to be ctdb commit d1057ed6de7de9f2a64d8fa012c52647e89b515b)	2008-12-02 13:26:30 +11:00
Ronnie Sahlberg	a782bdbacd	new version 1.0.66 (This used to be ctdb commit 499a01fece2a5f24f1b2943cf3dc6e9a3a8ca3b5)	2008-11-24 19:06:02 +11:00
Ronnie Sahlberg	1e2831898c	allow changing the recmaster even if the database is not frozen (This used to be ctdb commit 03e2e436db5cfd29a56d13f5d2101e42389bfc94)	2008-11-21 16:24:12 +11:00
Andrew Tridgell	59b6a9a9e6	fixed problem with looping ctdb recoveries After a node failure, GPFS can get into a state where non-blocking fcntl() locks can take a long time. This leads to the ctdb set_recmode test timing out, which leads to a recovery failure, and a new recovery. The recovery loop can last a long time. The fix is to consider a fcntl timeout as a success of this test. The test is to see that we can't lock the shared reclock file, so a timeout is fine for a success. (This used to be ctdb commit 6579a6a2a7161214adedf0f67dce62f4a4ad1afe)	2008-11-21 10:24:13 +11:00
Ronnie Sahlberg	331b9bdb5f	don't override/change CTDB_BASE if it is already set by the shell (This used to be ctdb commit 0a6f9326cb99f14b5c9edd0d8854d8229df49910)	2008-11-20 16:39:56 +11:00
Ronnie Sahlberg	a2a5904f66	Keepalive packets were only sent every KeepaliveInterval if the socket had been completely idle during that interval. If we had been sending other packets such as Messages, Calls or Controls there wouldn't be any need for an explicit keepalive and thus we didn't send one. This does make it somewhat awkward when analyzing traces since it is non-intuitive when keepalives are sent and when they are not sent. Change the keepalive logic to always send a keepalive regardless of whether the link is idle or not. (This used to be ctdb commit 7a18f33ec7512100dd067c65f0470889ff8fd591)	2008-11-20 13:35:08 +11:00
Ronnie Sahlberg	94a56ea410	rewrite the handling of flag updates across the cluster to eliminate a race between the ctdb tool and the recovery daemon both at once trying to push flag changes across the cluster. (This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa)	2008-11-20 12:43:18 +11:00
Ronnie Sahlberg	06728fdac9	we actually need a ctdb_db variable (This used to be ctdb commit aba984f1b85f5a2d370b093061cf15843ee53758)	2008-11-03 21:54:52 +11:00
Ronnie Sahlberg	d7007793ea	latency is measured in us, not ms. use an explicit ctdb_db variable instead of dereferencing state (This used to be ctdb commit 8c6a02fb423a8cbcbfc706767e3d353cd48073c3)	2008-10-30 13:34:10 +11:00
Ronnie Sahlberg	e1b0cea427	add control and logging of very high latencies. log the type of operation and the database name for all latencies higher than a threshold (This used to be ctdb commit 1d581dcd507e8e13d7ae085ff4d6a9f3e2aaeba5)	2008-10-30 12:49:53 +11:00
Ronnie Sahlberg	b9bd20ce55	add a context and a timed event so that once we have been in recovery mode for too long we drop all public ip addresses (This used to be ctdb commit 403c68f96e1380dd07217c688de2730464f77ea0)	2008-10-22 11:04:41 +11:00
Ronnie Sahlberg	beed899c4f	null out the pointer before we reload the nodes file (This used to be ctdb commit 4b0f32047e8bece0a052bdbe2209afe91b7e8ce3)	2008-10-17 21:38:42 +11:00
Ronnie Sahlberg	a924ef78b6	when we reload the nodes file, we may need to reload the nodes file inside the recovery daemon as well. (This used to be ctdb commit 82fd2b6b5cd8e988c38fa6b74121a048757bdeef)	2008-10-17 21:18:06 +11:00
Ronnie Sahlberg	ce66008e08	specify a "script log level" on the commandline to set under which log level any/all output from eventscripts will be logged as (This used to be ctdb commit cdc79d4f22f1a6aec5c34115969421f93663932a)	2008-10-17 07:56:12 +11:00
Ronnie Sahlberg	5808a7be96	allow multiple eventscripts using the same prefix. this eases the pain for users that use out of tree eventscripts (This used to be ctdb commit 8313dfb6fc5404cd2d065af6620412f8664ada11)	2008-10-16 17:57:50 +11:00
Ronnie Sahlberg	233b0e5cbb	lower the loglevel for the informational message that a TCP_ADD operation described an ip address not known to be a public address. This could happen if someone for genuine reasons accesses a share through a static ip address. It can also happen if non-homogeneous public address configurations are used and when a tcp description is pushed out to a different node that does not serve/know the specific ip address. (This used to be ctdb commit 9b1d089c99413f3681440f3cf33c293d118c9108)	2008-10-15 03:02:09 +11:00
Ronnie Sahlberg	41d19e650c	Revert "from Mathieu Parent <math.parent@gmail.com>" This reverts commit dc9cd4779db4a89697731e4cf415be51067a07c1. Conflicts: (This used to be ctdb commit d13da2e8fe2fab619540525d98a5502a23ab7d20)	2008-10-15 01:08:29 +11:00
Ronnie Sahlberg	cb300382b0	update TAKEIP/RELEASEIP/GETPUBLICIP/GETNODEMAP controls so we retain an older ipv4-only version of these controls. We need this so that we are backward compatible with old versions of ctdb and so that we can interoperate with an ipv4-only recmaster during a rolling upgrade. (This used to be ctdb commit 6b76c520f97127099bd9fbaa0fa7af1c61947fb7)	2008-10-14 10:40:29 +11:00
Ronnie Sahlberg	e5a3a73e64	from Mathieu Parent <math.parent@gmail.com> Hi, I have attached a patch necessary as debian log dir (/var/log) is not a subdir of VARDIR (/var/lib on rpm systems, /var/lib/ctdb on debian). As I don't know much about autotools and friends, this patch may be hacky. This is part of the process to minimize diff between distributions. (This used to be ctdb commit dc9cd4779db4a89697731e4cf415be51067a07c1)	2008-10-13 08:27:33 +11:00
Ronnie Sahlberg	3411e98e14	skip empty lines in the public addresses file, instead of skipping all non-empty lines (This used to be ctdb commit dc108adada33bb713f71a2859eda3b439ed0cd1a)	2008-10-07 19:34:34 +11:00
Ronnie Sahlberg	374906860c	from Michael Adams : allow #-style comments in the nodes and public addresses file (This used to be ctdb commit 5f96b33a379c80ed8a39de1ee41f254cf48733f9)	2008-10-07 19:25:10 +11:00
Ronnie Sahlberg	46187433ca	remove an unused variable (This used to be ctdb commit 4237bd3753dcb024c17461e974414bef1b609416)	2008-10-07 18:14:44 +11:00
Ronnie Sahlberg	1778280d50	When we reload the nodes file instead of shutting down/restarting the entire tcp layer just bounce all outgoing connections and reconnect (This used to be ctdb commit e701a531868149f16561011e65794a4a46ee6596)	2008-10-07 18:12:54 +11:00
Ronnie Sahlberg	3e274e5f8c	use the correct tunable failcount not timeout (This used to be ctdb commit 475cfada33b4c13aaaca773d5485bbe26bffbf46)	2008-09-17 14:24:12 +10:00
Ronnie Sahlberg	a3bbe238c9	The ctdb daemon keeps track of whether the recovery process is running correctly by measuring how long it was since the last successful communication with the recovery daemon was recorded. After a certain timeout the ctdb daemon would deem the recovery daemon as inoperable and shut down. If the system clock is suddenly changed forward by many (60 or more) seconds this could cause the timeout to trigger prematurely/immediately where ctdb would incorrectly think that more than 60 seconds had passed since last successful communications and thus abort. Instead of checking for one timeout occurring, only deem the recovery daemon to be "down" and trigger a shutdown if communications have timed out for three intervals in a row. (This used to be ctdb commit 196968c552e6ebcb57389d769a4b25f42fa8bc5d)	2008-09-17 14:17:41 +10:00
Ronnie Sahlberg	ad56356005	fix a slow memory leak in the recovery daemon in the error paths for the memdump function (This used to be ctdb commit 5e641ef9d6cca286061138a9680dcf2495736e8b)	2008-09-16 09:00:48 +10:00
Ronnie Sahlberg	7b718fffd7	fix some slow memory leaks in the vacuuming handler in the recovery daemon (This used to be ctdb commit 95bf36559d62f29e6f538f3a173b504ef3258341)	2008-09-16 07:55:57 +10:00
Ronnie Sahlberg	ab3649155a	From Volker L Fix a slow memory leak in the recovery daemon if there is a recovery triggered during the public ip reassignment process (This used to be ctdb commit 0aca4daf908b76d6013ff3dfad41beb9114fc1a3)	2008-09-16 06:50:28 +10:00
Ronnie Sahlberg	3bedb7f6d1	lower the debug level for when printing that the nodeflags have changed (This used to be ctdb commit a89977f8cb2463a87147dcc0ad936cb5d4131670)	2008-09-09 13:55:31 +10:00
Ronnie Sahlberg	6474f3278d	additional monitoring between the two daemons. we currently only monitor that the daemons are running by kill(0, pid) and verifying that the domain socket between them is ok. this is not sufficient since we can have a situation where the recovery daemon is hung. this new code monitors that the recovery daemon is operating. if the recovery hangs, we log this and shut down the main daemon (This used to be ctdb commit cd69d292292eaab3aac0e9d9fc57cb621597c63c)	2008-09-09 13:44:46 +10:00
Ronnie Sahlberg	70c7525a02	zero out the address structure to keep valgrind happy (This used to be ctdb commit 8060e591b0eb2d184b5a7444487477225d2e1dbf)	2008-08-29 12:26:02 +10:00
Ronnie Sahlberg	a35fa0aa8f	rename ctdb_tcp_client back to the original name ctdb_control_tcp (This used to be ctdb commit 4d1c0418cfe6170bc081684dbe45908a5d285f0b)	2008-08-27 10:24:35 +10:00
Ronnie Sahlberg	eb23d7b6d4	we must canonicalize the sockaddr structures in killtcp so that we do the necessary downgrade if required (This used to be ctdb commit 2f8b33948e395228cbac3450c0c684e49069abf0)	2008-08-20 12:02:54 +10:00
Ronnie Sahlberg	ef997d344f	initial ipv6 patch Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com> (This used to be ctdb commit 1f131f21386f428bbbbb29098d56c2f64596583b)	2008-08-19 14:58:29 +10:00
Andrew Tridgell	76528cfc6b	fixed a memory leak in the recovery daemon thanks to vl for spotting this (This used to be ctdb commit 96df98d9f86ecc6bb1a458eb2101e5c1bc0f96e6)	2008-08-11 23:33:05 +10:00
Andrew Tridgell	1431210d46	fixed send of release IP message (This used to be ctdb commit db6bc3745a56cc12e60e727190a098a6527690d6)	2008-08-08 22:06:39 +10:00
Andrew Tridgell	aa1bc0abba	added a new control CTDB_CONTROL_TRANS2_COMMIT_RETRY so we can tell the difference between an initial commit attempt and a retry, which allows us to get the persistent updates counter right for retries (This used to be ctdb commit 7f29c50ccbc7789bfbc20bcb4b65758af9ebe6c5)	2008-08-08 13:11:28 +10:00
Andrew Tridgell	5a0249d34c	return a more detailed error code from a trans2 commit error (This used to be ctdb commit 6915661a460cd589b441ac7cd8695f35c4e83113)	2008-08-08 09:58:49 +10:00
Andrew Tridgell	66d154ef5f	Merge commit 'ronnie/1.0.53' (This used to be ctdb commit 58e6dc722ad1e2415b71baf1d471885169dde14d)	2008-08-08 00:48:19 +10:00
Andrew Tridgell	5ee51ae84e	fixed a looping error bug with the new transactions code (This used to be ctdb commit 0592ba2a4fbd1b3b7a6bd0780eadbd6d449baaad)	2008-08-08 00:44:33 +10:00
Ronnie Sahlberg	31fcc1bbb2	Merge git://git.samba.org/tridge/ctdb (This used to be ctdb commit 66c61137a5c01afcbae329ffbe121e78ae087399)	2008-08-07 18:50:48 +10:00
Andrew Tridgell	bbedba23c7	cover some corner cases where the persistent database could become inconsistent (This used to be ctdb commit c76c214be401cb116265ed17ffe6c77c979ded82)	2008-08-07 13:34:18 +10:00
Ronnie Sahlberg	b9d8bb23af	remove the reclock file we store pnn counts in. This file creates additional locking stress on the backend filesystem and we may not need it anyway. (This used to be ctdb commit 84236e03e40bcf46fa634d106903277c149a734f)	2008-08-06 11:52:26 +10:00
Andrew Tridgell	78acc59784	implemented replayable transactions in ctdb to prevent deadlock (This used to be ctdb commit b6d9a0396fb4b325778d3810dc656f719f31b9f1)	2008-08-04 14:51:51 +10:00
Andrew Tridgell	cf739ac892	renamed the pulldb structure to a ctdb_marshall_buffer (This used to be ctdb commit bad53b2d342bb9760497e6f4a61e64ca50d6e771)	2008-07-30 19:59:18 +10:00
Andrew Tridgell	ca3eaf87e1	make sure we honor the TDB_NOSYNC flag from clients in the server (This used to be ctdb commit 9806d18b93218c216d538e28f9ed495269f0a938)	2008-07-30 19:58:49 +10:00
Andrew Tridgell	98502135e7	added new multi-record transaction commit code (This used to be ctdb commit 9ff3380099fe6f4d39de126db0826971a10ee692)	2008-07-30 19:57:00 +10:00
Andrew Tridgell	abe0232818	rename the structure we use for marshalling multiple records (This used to be ctdb commit 4d205476d286570a6e1f52b59af42858ce051106)	2008-07-30 14:24:56 +10:00
Andrew Tridgell	79793708a4	fixed buffering in ctdb logging code to handle multiple lines correctly (This used to be ctdb commit e8ef9891aa31c374921b23cc74e1eda1f8218bf0)	2008-07-23 15:25:52 +10:00
Ronnie Sahlberg	1bfcca524d	From Michael Adams, change one element from private to private_data Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com> (This used to be ctdb commit 0de79352c9b36c118e36905f08ebbe38ecbb957e)	2008-07-22 09:07:42 +10:00


383 Commits
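
The message for commit a3bbe238c9 describes replacing a single-timeout check with a rule that only declares the recovery daemon down after three timed-out intervals in a row. The following is a minimal sketch of that counting rule, not the ctdb implementation (which is written in C); the class name, method name, and parameter are hypothetical and chosen only for illustration.

```python
class RecoveryWatchdog:
    """Illustrative only: counts consecutive check intervals with no contact.

    All names here are assumptions; they do not come from the ctdb source.
    """

    def __init__(self, max_missed=3):
        # Number of consecutive missed intervals tolerated before shutdown.
        self.max_missed = max_missed
        self.missed = 0

    def tick(self, heard_from_recovery_daemon):
        """Call once per check interval; return True if shutdown is warranted."""
        if heard_from_recovery_daemon:
            # Any successful communication resets the counter.
            self.missed = 0
        else:
            self.missed += 1
        return self.missed >= self.max_missed


wd = RecoveryWatchdog()
# A single spurious miss (for example after a clock jump) does not trigger
# a shutdown; only three misses in a row do.
history = [True, False, False, True, False, False, False]
results = [wd.tick(ok) for ok in history]
```

Counting intervals instead of comparing wall-clock timestamps means one forward jump of the system clock can cost at most one missed interval, which matches the motivation given in the commit message.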