mirror of https://github.com/samba-team/samba.git synced 2025-01-27 14:04:05 +03:00

205 Commits

Author SHA1 Message Date
Ronnie Sahlberg
755511d28d set the flags explicitly instead of masking them in
(This used to be ctdb commit 27a5f9dead44890683f9dbc4f07cda11264aa03b)
2007-10-18 16:54:00 +10:00
Andrew Tridgell
b814462c38 added some debug lines to help track down a problem
(This used to be ctdb commit 2ca31e9de179f76e392a26cc8305e2473357c760)
2007-10-18 16:27:36 +10:00
Andrew Tridgell
d939a2901b merge from ronnie
(This used to be ctdb commit 75d4b386293e186a6bb8532515585ab72670d663)
2007-10-18 15:44:02 +10:00
Ronnie Sahlberg
ce7a054d20 add back the test inside the daemon that, when someone asks us to drop
recovery mode back to NORMAL, we can not lock the reclock file,
since at this stage it MUST be locked by the recovery daemon.

in order to prevent a non-blocking fcntl() lock from blocking and causing
"issues", we move the 'test that we can not lock reclock file' into a
child process.

(This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e)
2007-10-16 15:27:07 +10:00
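The child-process lock test described in the commit above can be sketched roughly as follows. This is an illustration only, not the ctdb implementation (which is written in C): the function name `reclock_held_by_other` and the exit-code convention are hypothetical, and the parent here simply blocks in `waitpid()` where a real daemon would presumably wait from its event loop with a timeout.

```python
import errno
import fcntl
import os


def reclock_held_by_other(path):
    """Return True if some other process holds a lock on `path`.

    The non-blocking lock attempt runs in a forked child, so a stalled
    fcntl() call stalls the child rather than the calling process.
    (Hypothetical helper; exit codes 0/1/2 are this sketch's convention.)
    """
    pid = os.fork()
    if pid == 0:
        # Child: try to take an exclusive lock without waiting.
        status = 2  # unexpected failure
        try:
            fd = os.open(path, os.O_RDWR)
            try:
                fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                status = 0  # we got the lock, so nobody else held it
            except OSError as exc:
                if exc.errno in (errno.EACCES, errno.EAGAIN):
                    status = 1  # the lock is held elsewhere
        finally:
            # Exiting releases any lock the child acquired.
            os._exit(status)

    # Parent: this sketch blocks here; a daemon would not.
    _, wait_status = os.waitpid(pid, 0)
    if not os.WIFEXITED(wait_status):
        raise RuntimeError("lock-test child did not exit normally")
    code = os.WEXITSTATUS(wait_status)
    if code == 0:
        return False
    if code == 1:
        return True
    raise RuntimeError("lock-test child failed unexpectedly")
```

Under this convention, a daemon asked to return to NORMAL would expect the function to return True, since the recovery daemon should still hold the lock at that point.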
Ronnie Sahlberg
056aac6e0c add a new tunable, DeterministicIPs, that makes the allocation of
public addresses to nodes deterministic.

Activate it by adding CTDB_SET_DeterministicIPs=1 in /etc/sysconfig/ctdb

When this is set, the first entry in /etc/ctdb/public_addresses will
always be hosted by node 0 when that node is available, the second
entry by node 1, and so on.

This tunable allows the allocation of addresses to become very
unbalanced and is only for debugging/testing use.
Beware, this feature requires that /etc/ctdb/public_addresses is
identical on all the nodes in the cluster.

(This used to be ctdb commit f0ca221f235731542090d8a6c86f2b7cd2ce2f96)
2007-10-16 12:15:02 +10:00
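The position-based allocation described in the commit above might look something like the sketch below. The function name is hypothetical, and two behaviours are assumptions that the commit message does not state: wrapping around with a modulo when there are more addresses than nodes, and leaving an address unassigned (`None`) when its designated node is unavailable.

```python
def deterministic_ip_owners(public_addresses, num_nodes, available):
    """Map each public address to a node number by its list position.

    Entry i goes to node i % num_nodes when that node is available.
    The wrap-around and the None fallback are assumptions of this
    sketch, not documented ctdb behaviour.
    """
    owners = {}
    for index, address in enumerate(public_addresses):
        node = index % num_nodes
        owners[address] = node if node in available else None
    return owners
```

Because the result depends only on the order of the list, every node has to read the same list to arrive at the same mapping, which is why the file must be identical across the cluster.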
Ronnie Sahlberg
25d3a031d0 include system/network.h so we get the prototype for inet_aton()
(This used to be ctdb commit 7145764b2d217f88a723dcb0ffd4e5a1567d64cf)
2007-10-16 11:29:33 +10:00
Ronnie Sahlberg
7e2e1b14fb merge from tridge
(This used to be ctdb commit 9e6bc12c9be2dabcfb9c6aeef257ef4737287fab)
2007-10-16 11:26:22 +10:00
Ronnie Sahlberg
b3ff7d904d don't try to lock the file from inside the ctdb daemon.
even though we don't want a blocking lock, it does appear that the fcntl()
call can block for a while if gpfs is in the process of rebuilding
itself after a node joins or leaves the cluster

(This used to be ctdb commit 6c0d206dea7116db71bccb4802a93dd7283249f6)
2007-10-16 09:50:31 +10:00
Andrew Tridgell
99bc0aca93 sync flags between nodes in monitor loop in recmaster
(This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210)
2007-10-15 14:28:51 +10:00
Andrew Tridgell
0e855c0772 merge from ronnie
(This used to be ctdb commit d18712caba11855010be52f90bac656683076676)
2007-10-15 14:17:49 +10:00
Andrew Tridgell
174879621e add config option for disabling bans
(This used to be ctdb commit 153b911f7f957d4c564b04f5aa878033a02da9e4)
2007-10-15 13:22:58 +10:00
Ronnie Sahlberg
1a4999076b first check that recovery master is connected (we know this from our own
flags)

then pull the flags off recovery master before checking if it is banned

(This used to be ctdb commit 94c1d234e57a40eda2d8b892dd9fbe1ffc4b3433)
2007-10-11 07:10:17 +10:00
Ronnie Sahlberg
167e100d4b simplify election handling
make sure we read and update the flags from all remote nodes before we 
reach the first codepath that can call do_recovery()
since during do_recovery() we need to know what the flags are.

(This used to be ctdb commit e85f3806483ea420559d449e0e4d81bec996740f)
2007-10-11 06:16:36 +10:00
Ronnie Sahlberg
33a6aa3c3f merge from tridge
(This used to be ctdb commit 4690a205fe4325b03ab044bdb5fbc9aa3e94db6e)
2007-10-10 10:49:55 +10:00
Andrew Tridgell
011a205b86 make sure reconnected nodes start off as unhealthy so they don't get a public IP
(This used to be ctdb commit c733ec6760cae01ce277f491caf1355e46de5cf7)
2007-10-10 10:45:22 +10:00
Ronnie Sahlberg
bdd67bba1e add a --single-public-ip argument to ctdbd to specify the ip address
used in single public ip address mode.
when using this argument, --public-interface must also be used.

add a vnn structure to the ctdb context to describe the single public ip
address

update the killtcp control in the daemon so that if a socketpair that is to
be killed does not match a normal public address, it checks if the
destination address matches the single public ip address and if so uses
that vnn structure from the ctdb context

this allows killtcp to also kill connections to the single public ip
instead of only normal public addresses

(This used to be ctdb commit 5661ba17b91f62821dec1c76056c78b99752a90b)
2007-10-10 09:42:32 +10:00
Ronnie Sahlberg
7735957693 remove some debug outputs
(This used to be ctdb commit f29c0b52df1f455909ba133e3ad3bc462dc32929)
2007-10-09 13:45:42 +10:00
Ronnie Sahlberg
80cd82f8e4 add a control to send gratuitous ARPs from the ctdb daemon
(This used to be ctdb commit 563819dd1acb344f95aabb4bad990b36f7ea4520)
2007-10-09 11:56:09 +10:00
Ronnie Sahlberg
de6c5ed14d merge from tridge
(This used to be ctdb commit 02cda01c032804cb1c53593ceb98685c827e2d58)
2007-10-06 08:11:24 +10:00
Andrew Tridgell
50770008df fixed several places where we set the recovery culprit incorrectly
(This used to be ctdb commit d9da73395fa443801fc68ec53a42b548e832d58a)
2007-10-05 13:51:31 +10:00
Andrew Tridgell
4115492992 - catch ESTALE in the recovery lock by trying a read()
- prioritise nodes that are unbanned and healthy in the election

(This used to be ctdb commit 929feb475dfdf7283f0e99b50b179e1c91d3a39f)
2007-10-05 13:28:21 +10:00
Andrew Tridgell
fb48f2d5a2 we are the culprit if we can't get the reclock
(This used to be ctdb commit 1d320e113c6134ff6822b985a47131d8204af35a)
2007-10-05 12:01:40 +10:00
Ronnie Sahlberg
72379ee3eb change async.private to async.private_data since private is a reserved
word in C++

(This used to be ctdb commit 79eb28f6cd5dcc30b04966d202a050eaf98a2552)
2007-09-26 14:25:32 +10:00
Ronnie Sahlberg
359448ff00 when we have a public ip address mismatch (i.e. we hold addresses we
shouldn't, or we are not holding addresses we should)
we must first freeze the local node before we set the recovery mode

(This used to be ctdb commit a77a77e8b5180f6a4a1f3d7d4ff03811f3b71b56)
2007-09-24 10:52:26 +10:00
Andrew Tridgell
e3d0ec8797 fixed a fd leak on the recovery lock
(This used to be ctdb commit 186f35c42ed4fcc9ed44390b0dd036ece475d45e)
2007-09-24 10:19:07 +10:00
Andrew Tridgell
80100c3573 run monitoring more quickly when unhealthy and at startup
(This used to be ctdb commit ff1c205928e3ef5bcc6bf4e4b2122a19fa38d8f4)
2007-09-24 10:12:18 +10:00
Andrew Tridgell
b87ddd9148 no longer wait at startup for services to become available, instead
set the node initially unhealthy and let the status monitoring bring the node online.
This fixes a problem with winbindd, where it refused to start because secrets.tdb was not populated
but we could not populate ctdbd, because the net command would not run while ctdbd was still doing startup
and thus frozen
(This used to be ctdb commit 3a001b793dd76fb96addf1e2ccb74da326fbcfbc)
2007-09-24 10:00:14 +10:00
Andrew Tridgell
4178cb98a1 fixed a valgrind error, and some warnings
(This used to be ctdb commit c0f52dbb385fa0748680adb7c40755c92e577551)
2007-09-24 09:57:14 +10:00
Andrew Tridgell
2607c222fc avoid using connected nodes that aren't in the vnn map yet
(This used to be ctdb commit 2b5ae133f5f6fa9ad1a8896fe4b4c542d4ca462d)
2007-09-21 15:44:13 +10:00
Ronnie Sahlberg
51d912063c in ctdb_control_persistent_store() we must talloc_steal() the pointer to
c to prevent it from being immediately freed (and our persistent store
state with it) if we need to wait asynchronously for other nodes before
we can reply back to the client

(This used to be ctdb commit fa5915280933e4d2e7d4d07199829c9c2b87a335)
2007-09-21 15:19:33 +10:00
Ronnie Sahlberg
61e885d0b9 when ctdb attaches to a database it broadcasts the attach to all other
nodes so that the db is created on them as well

when we send this broadcast we must use the correct control and not
assume all databases created are of the temporary kind

(This used to be ctdb commit 106f816d4a0814ca4418de051289d9fc62df7dd2)
2007-09-21 13:47:40 +10:00
Andrew Tridgell
c60988325d added support for persistent databases in ctdbd
(This used to be ctdb commit 3115090a0d882beca9d70761130b74bb0821f201)
2007-09-21 12:24:02 +10:00
Andrew Tridgell
81bfa58d58 make sure we set close on exec on any possibly inherited fds
(This used to be ctdb commit d9dec82076f14a348e7b67b4350180681ff86f32)
2007-09-19 11:46:37 +10:00
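For readers unfamiliar with the close-on-exec flag mentioned in the commit above, the underlying operation is a read-modify-write of the descriptor flags via `fcntl()`. The helper below is a generic sketch of that operation (the name `set_close_on_exec` is hypothetical), not code taken from ctdb.

```python
import fcntl
import os


def set_close_on_exec(fd):
    """Mark `fd` so that it is closed automatically across exec()."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
```

Setting the flag on every descriptor that could have been inherited keeps child programs, such as event scripts, from holding the daemon's sockets or lock files open.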
Andrew Tridgell
c62490569b cope with non-standard install dirs in event scripts
(This used to be ctdb commit 52fff5345873690a9cc86495f414343eaa3bd540)
2007-09-14 14:14:03 +10:00
Andrew Tridgell
955d4d8615 make sure all public IPs are removed at startup
(This used to be ctdb commit b16f33787f2a9471285037f4a6d470e826536570)
2007-09-14 11:56:40 +10:00
Ronnie Sahlberg
6052078b53 let each node verify that it has a correct assignment of public ip
addresses (i.e. it holds those it should hold and it doesn't hold
any of those it shouldn't hold)

if an inconsistency is found, mark the local node as recovery mode
active and wait for the recovery master to trigger a full blown recovery

(This used to be ctdb commit 55a5bfc8244c5b9cdda3f11992f384f00566b5dc)
2007-09-14 10:16:36 +10:00
Andrew Tridgell
42fc00bda9 - merge from ronnie
- add a flag to check that recovery completed correctly. If not, re-trigger it in monitoring

(This used to be ctdb commit d5ed941d9bab4af30d8b5f9b77bdf43d9218d69b)
2007-09-14 09:49:12 +10:00
Ronnie Sahlberg
4186d8eaba when a ctdb_takeover_run has failed we must make sure that
need_takeover_run is set to true, or else we might forget to rerun it
during the next recovery

otherwise, need_takeover_run is only set to true if and only if the node
flags for a remote node and the local node differ.
It is possible that a takeover run fails, and thus the reassignment of
ip addresses is incomplete, but that before we get back to the test in
monitor_cluster() the node flags of all nodes have converged
and now match each other again, thus causing
monitor_cluster() to fail to realize that a takeover run is needed.

(This used to be ctdb commit ae7e866787cebd14394983ce1834387c959d1022)
2007-09-13 14:51:37 +10:00
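The reasoning in the commit above comes down to making the decision depend on the outcome of the previous takeover run as well as on the flag comparison. A minimal sketch of that decision follows; the function signature and the dictionary representation of node flags are hypothetical and do not reflect ctdb's actual data structures.

```python
def need_takeover_run(local_view, remote_views, last_takeover_failed):
    """Decide whether the next monitoring pass must run a takeover.

    local_view:   node number -> flags as recorded by the local node
    remote_views: node number -> flags as reported by that node
    (Both mappings are hypothetical stand-ins for ctdb's node flags.)
    """
    # A failed run leaves the IP reassignment incomplete, so it has to be
    # retried even if every node's flags agree again in the meantime.
    if last_takeover_failed:
        return True
    # Otherwise a takeover is only needed when the views disagree.
    return any(
        local_view.get(node) != flags for node, flags in remote_views.items()
    )
```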
Andrew Tridgell
9d50595b8a prevent recursion in the calling of ctdb_takeover_run
(This used to be ctdb commit 0fbdeb7c91b965d9bc5ecc7b24e31070378d8f1d)
2007-09-13 14:08:18 +10:00
Andrew Tridgell
30de14fe79 force recovery if unable to tell a node to release an IP
(This used to be ctdb commit 6895788d2499344a03357e5c1103cb8383e9eaf7)
2007-09-13 11:19:49 +10:00
Andrew Tridgell
3c0f61cb92 we don't need the is_loopback logic in ctdb any more
(This used to be ctdb commit 4ecf29ade0099c7180932288191de9840c8d90a9)
2007-09-13 10:45:06 +10:00
Andrew Tridgell
67bd64ef35 - don't allow the registration of clients with IPs we don't hold
- change some debug levels to make tracking of IP release problems easier
(This used to be ctdb commit 5f9aed62adaf87750f953412c55b29c58e4bb6c0)
2007-09-12 13:22:31 +10:00
Andrew Tridgell
a478c78f03 changed some debug levels
(This used to be ctdb commit ed764533e1c2f8982e1577ca5e7f5f4482a15345)
2007-09-12 13:21:19 +10:00
Andrew Tridgell
5b65a6c7f0 get interface right
(This used to be ctdb commit e0edc38d7e897f7de2850eb2cfd17fea75c16fcc)
2007-09-10 20:45:27 +10:00
Andrew Tridgell
8cd7ca149e fixed a pointer cast warning
(This used to be ctdb commit df0e7a4aa13112d613702d8ea0fb0e18510d293c)
2007-09-10 15:16:17 +10:00
Andrew Tridgell
f3ae1cdb02 - use struct sockaddr_in more consistently instead of string addresses
- allow for public_address lines with a defaulting interface

(This used to be ctdb commit 29cb760f76e639a0f2ce1d553645a9dc26ee09e5)
2007-09-10 14:27:29 +10:00
Andrew Tridgell
70ec39b1b1 add back in --public-interface as a default
(This used to be ctdb commit cdf56daf69b2c8381ee673943e982ad20f19affd)
2007-09-10 14:26:35 +10:00
Andrew Tridgell
42168177ef merge from ronnie
(This used to be ctdb commit 1f21d4d563232926c35d03c4d69eb69190823dc6)
2007-09-10 13:21:11 +10:00
Ronnie Sahlberg
4ac749bfa4 change the signature of ctdb_sys_have_ip() to also return:
 a bool that specifies whether the ip was held by a loopback adaptor or
 not
 the name of the interface where the ip was held

when we release an ip address from an interface, move the ip address
over to the loopback interface

when we release an ip address, after we have moved it onto loopback,
use 60.nfs to kill off the server side (the local part) of the tcp
connection so that the tcp connections don't survive a
failover/failback

61.nfstickle: since we kill the tcp connections when we release an ip
address, we no longer need to restart the nfs service in 61.nfstickle

update ctdb_takeover to use the new signature for ctdb_sys_have_ip

when we add a tcp connection to kill in ctdb_killtcp_add_connection()
check if either the source or destination address matches a known public
address

(This used to be ctdb commit f9fd2a4719c50f6b8e01d0a1b3a74b76b52ecaf3)
2007-09-10 07:20:44 +10:00
Ronnie Sahlberg
e4eeceaf3a don't dereference vnn before we have assigned it a pointer value
(This used to be ctdb commit 2a8fc69aea8527b22a3fe57427677e4caff57338)
2007-09-05 14:29:44 +10:00