mirror of https://github.com/samba-team/samba.git synced 2025-01-12 09:18:10 +03:00
Commit Graph

1523 Commits

Author SHA1 Message Date
Ronnie Sahlberg
df5dd43e7c add a new tunable "CheckNodesFile" that when set to 0 will disable the
check in the recovery daemon that all nodes are using the same 
/etc/ctdb/nodes file.

Also add some more missing checks that the pnn used is a valid pnn 
before using it to dereference the ctdb->nodes array


This is useful since it allows us to add more physical nodes to an 
existing cluster without having to bring down the entire cluster.

The procedure to add an additional node to an existing cluster would then be
1, on all nodes set CheckNodesFile=0 using 'ctdb setvar'
2, on all nodes add CTDB_SET_CheckNodesFile=0 to /etc/sysconfig/ctdb
For each node, one at a time:
3, use 'ctdb disable' to stop the hosted services
4, service ctdb stop
5, service ctdb start
Once all nodes have been restarted 
6, on all nodes remove CTDB_SET_CheckNodesFile=0 from 
/etc/sysconfig/ctdb
7, on all nodes set CheckNodesFile=1 using 'ctdb setvar'

8, configure and start up the new node

During this procedure, only one node at a time was brought 
down/restarted and was so only for a short period.
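
The steps above can be sketched as a dry run that only prints what would be done on which node. This is a minimal illustration, not part of the commit: the node names, the `print_add_node_plan` helper and the `ctdb setvar <name> <value>` argument order are assumptions, and it assumes step 7 is meant to switch the check back on (value 1).

```shell
#!/bin/sh
# Dry-run sketch: prints the rolling-restart plan, runs nothing.
# print_add_node_plan is a hypothetical helper; arguments are the
# names of the nodes already in the cluster.
print_add_node_plan() {
    echo "all: ctdb setvar CheckNodesFile 0"
    echo "all: add CTDB_SET_CheckNodesFile=0 to /etc/sysconfig/ctdb"
    # One node at a time, so the cluster as a whole stays up.
    for node in "$@"; do
        echo "$node: ctdb disable"
        echo "$node: service ctdb stop"
        echo "$node: service ctdb start"
    done
    echo "all: remove CTDB_SET_CheckNodesFile=0 from /etc/sysconfig/ctdb"
    # Assumption: the check is re-enabled here.
    echo "all: ctdb setvar CheckNodesFile 1"
    echo "new: configure and start up the new node"
}

plan=$(print_add_node_plan node0 node1)
printf '%s\n' "$plan"
```

Printing the plan instead of executing it keeps the sketch safe to run anywhere; each printed line would still have to be carried out by hand or by a tool of your choice.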

(This used to be ctdb commit 462501a32143e943ce350bd904a47c0955414a51)
2007-11-05 13:36:11 +11:00
Andrew Tridgell
82bd652749 patch from michael adam
(This used to be ctdb commit a7a3bef90f033bab5cb110a6ef77a8bef48f2588)
2007-11-02 13:20:29 +11:00
Ronnie Sahlberg
8b1ad1073b merge from tridge
(This used to be ctdb commit 10302eeecc36c4ce94a4e2e0e57864be790325da)
2007-11-01 09:00:14 +11:00
Andrew Tridgell
29e48fe54a increase release number
(This used to be ctdb commit dc648b1bb6becc52dcf900add97418a5634367eb)
2007-10-30 10:19:43 +11:00
Andrew Tridgell
684282f7a1 added bonding info to ctdb_diagnostics
(This used to be ctdb commit 71b5fc434bc5d88eb0669ee29aa932ba12737e07)
2007-10-30 10:18:52 +11:00
Andrew Tridgell
87bfa7d61e merge from ronnie
(This used to be ctdb commit 22b110549ff35f2560043abd5d85bed4b35295ee)
2007-10-29 13:43:12 +11:00
root
2a70ac8801 the while loop in the startup event runs as a subshell so we need an extra || exit 1 at the end
to propagate the error code back to the caller of the script
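
The effect described here can be reproduced with a small sketch. It is not the actual event script: `check_service` and the service names are made up, and it assumes a shell such as bash or dash in which a `while` loop fed by a pipe runs in a subshell.

```shell
#!/bin/sh
# Hypothetical check: the service named "bad" fails.
check_service() {
    [ "$1" != "bad" ]
}

# Without the trailing "|| exit 1": the "exit 1" inside the loop only
# leaves the subshell created by the pipeline, so the script carries on.
startup_unchecked() {
    printf '%s\n' good bad | while read -r svc; do
        check_service "$svc" || exit 1
    done
    exit 0   # normal end of the script
}

# With the trailing "|| exit 1": the failure of the loop's subshell
# is propagated to the caller.
startup_checked() {
    printf '%s\n' good bad | while read -r svc; do
        check_service "$svc" || exit 1
    done || exit 1
    exit 0
}

# Run each variant in its own subshell so "exit" does not end this demo.
unchecked_status=0
( startup_unchecked ) || unchecked_status=$?
checked_status=0
( startup_checked ) || checked_status=$?
echo "unchecked=$unchecked_status checked=$checked_status"
```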

(This used to be ctdb commit c30d5c328784059949f5e82a07008e9632234f20)
2007-10-29 12:34:45 +11:00
Ronnie Sahlberg
8599f2008d if bond* interfaces are used as public interfaces we can not rely on ethtool but
have to check /proc for the status instead
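
A rough sketch of such a check follows. The commit only says "/proc"; this sketch assumes the bonding driver's status file (normally /proc/net/bonding/<interface>) and its "MII Status:" line, and `bond_is_up` is a hypothetical helper. A temporary sample file stands in for the real /proc entry so the sketch runs anywhere.

```shell
#!/bin/sh
# Succeeds if the first "MII Status:" line of the given bonding status
# file reports "up". That first line describes the bond itself; later
# ones describe the individual slave interfaces.
bond_is_up() {
    status=$(awk -F': *' '/^MII Status:/ { print $2; exit }' "$1")
    [ "$status" = "up" ]
}

# Sample content in the style of /proc/net/bonding/bond0 (abridged).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth0
MII Status: up
Slave Interface: eth0
MII Status: up
EOF

if bond_is_up "$sample"; then
    bond_state=up
else
    bond_state=down
fi
rm -f "$sample"
echo "bond state: $bond_state"
```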

(This used to be ctdb commit 4ed7747267aea265b7a71c651abf6d5db4f4718b)
2007-10-29 10:51:16 +11:00
Ronnie Sahlberg
ba6f9ae4a7 merge from tridge
(This used to be ctdb commit c7777b966f6a6e0f4126c03300338fdc822ac6c9)
2007-10-29 08:50:51 +11:00
Ronnie Sahlberg
bd73497a18 merge from tridge
(This used to be ctdb commit 919ba610c61cfaf5ecc1ab64ad8be34a80d928f4)
2007-10-29 08:40:46 +11:00
Andrew Tridgell
6d75f0703e added monitoring of ftp ports
(This used to be ctdb commit 4780e078fb55d69053f78a4bbc7c67e569bb5dae)
2007-10-26 14:53:09 +10:00
Ronnie Sahlberg
533a530177 since service nfs stop/start sometimes fail to bring up the mount daemon on rhel5
check if mountd is running during monitoring and if it is not, try to restart it

(This used to be ctdb commit 3d4b74669164b519398aeeacd59714f1e3884eff)
2007-10-23 12:35:43 +10:00
Andrew Tridgell
1d6b4f418d update release number
(This used to be ctdb commit fe6766940b2cf8a84ed51824158c956362a5806d)
2007-10-23 11:56:52 +10:00
Andrew Tridgell
2cea351f45 merge from ronnie
(This used to be ctdb commit cc70a2cc5f5400d6480cb609e1fa203236917976)
2007-10-23 11:45:36 +10:00
Ronnie Sahlberg
44ab81763d merge from tridge
(This used to be ctdb commit 938e375a80ce2f1827117c38554f576f73a5c71e)
2007-10-23 06:42:45 +10:00
Andrew Tridgell
6e6de1e4b7 fixed a problem with backgrounding onnode
(This used to be ctdb commit 4e23630224bb219cfbbf129c4562da5a4c2d601a)
2007-10-22 21:11:02 +10:00
Andrew Tridgell
8e22bca5ca fixed a double close of a socket, leading to an EPOLL error
(This used to be ctdb commit bbe8ad842bdfedd37ef14a6be07ad939113fe9b1)
2007-10-22 16:41:11 +10:00
Ronnie Sahlberg
6a32af60b8 nfs may take a while to stop so do it in the background
(This used to be ctdb commit 2ccaeaf6a65731c17173a4945e3e00e230e67d35)
2007-10-22 15:14:49 +10:00
Andrew Tridgell
2d8afd85d5 another place where we need to mark connect_fde as freed
(This used to be ctdb commit d047fbeafebe4b150602f9a91802795659058b16)
2007-10-22 15:13:32 +10:00
Andrew Tridgell
2931ed5d17 fixed a valgrind uninitialised memory error due to pad bytes
(This used to be ctdb commit aea9b0c8d467fe19815c046969e9c1049a3a20ac)
2007-10-22 15:13:08 +10:00
Andrew Tridgell
f09537e7f1 prevent a double free
(This used to be ctdb commit 5a1b923abb36c6deb99ae178fdd54f12235dc309)
2007-10-22 14:07:35 +10:00
Ronnie Sahlberg
1d6a74f943 when shutting down, we should stop monitoring
(This used to be ctdb commit 325683ef8f326f0565a827ff2c493adcab6e0d64)
2007-10-22 12:34:51 +10:00
Ronnie Sahlberg
4a97876fb7 when we are shutting down, we should first shut down the recovery daemon
(This used to be ctdb commit 39ade6b329adcd3234124d6a8daaa6181abf739b)
2007-10-22 12:34:08 +10:00
Andrew Tridgell
3507664562 merge from ronnie
(This used to be ctdb commit b47fdc1fc86431c9159b595047faa76ba31f6829)
2007-10-22 10:26:25 +10:00
Ronnie Sahlberg
f022df1d40 dont set parameters in statd-callout. if they should be set, they
should be set from 10.interfaces

(This used to be ctdb commit 0c7c2dae0a976922de58793d576855bc37cd38e1)
2007-10-22 10:18:38 +10:00
Ronnie Sahlberg
caad5dc38d dont set some of the sysctl variables in statd-callout. these are
mainly useful for avoiding ack-storms when doing very rapid 
failover/failback during testing but should not be required in 
real-world use.

this gets rid of a lot of annoying messages from the messages file

(This used to be ctdb commit 50d289dcce2caa7c7be9b6faa3b38b69c2237038)
2007-10-21 06:42:33 +10:00
Ronnie Sahlberg
34f06575f8 merge from tridge
(This used to be ctdb commit a45cfb29d9a0babccddc6aa26e71c00524da1d97)
2007-10-19 15:19:25 +10:00
Andrew Tridgell
1a8338e443 increase release number
(This used to be ctdb commit 747ff96f1d93c52ba7548d0540266b0277d88ac1)
2007-10-19 12:22:24 +10:00
Ronnie Sahlberg
600264ac02 dont close the file, just set the fd to -1
(This used to be ctdb commit 04b26aa09e69b3c9fa1db245b5123c3cc02db8af)
2007-10-19 11:03:12 +10:00
Andrew Tridgell
f47f758fe8 merge from ronnie
(This used to be ctdb commit d444fdc7782496abe4b27003b647ac49fb52e6be)
2007-10-19 09:39:07 +10:00
Andrew Tridgell
623e216dcf remove an incorrectly added file
(This used to be ctdb commit ff01a32db81b6c04d42634f5660181c270988264)
2007-10-19 09:30:55 +10:00
Ronnie Sahlberg
e81e008a36 add missing ) in the IB transport (which i dont compile for)
(This used to be ctdb commit 7f7a184bae87d46bd589d11068b6443b007366b4)
2007-10-19 09:05:37 +10:00
Ronnie Sahlberg
fe7b5b4d85 add a stub restart method for IB
(This used to be ctdb commit d318504ad5a49dbdfa307be39ae88df839e6308d)
2007-10-19 09:04:52 +10:00
Ronnie Sahlberg
d1ba047b7f add a new transport method so that when a node is marked as dead, we
shut down and restart the transport

otherwise, if we use the tcp transport the tcp connection might try to 
retransmit the queued data during the time the node is unavailable.
this together with the exponential backoff for tcp means that the tcp 
connection quickly reaches the maximum backoff rto which is often 60 or 
120 seconds.   this would mean that it could take up to 60/120 seconds 
before the tcp layer detects that the connection is dead and it has to 
be reestablished.
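
The arithmetic behind this can be sketched as follows. The numbers are illustrative only: the starting timeout of 1 second is an assumption, the 120 second cap is one of the two values mentioned above, and `backoff_schedule` is a hypothetical helper, not kernel or ctdb code.

```shell
#!/bin/sh
# Print the retransmission timeouts (in seconds) for successive retries:
# the timeout doubles after every failed attempt until it hits the cap.
# usage: backoff_schedule <initial_rto> <max_rto> <retries>
backoff_schedule() {
    rto=$1
    cap=$2
    n=$3
    out=""
    i=0
    while [ "$i" -lt "$n" ]; do
        out="$out${out:+ }$rto"
        rto=$((rto * 2))
        if [ "$rto" -gt "$cap" ]; then
            rto=$cap
        fi
        i=$((i + 1))
    done
    echo "$out"
}

schedule=$(backoff_schedule 1 120 9)
echo "$schedule"   # 1 2 4 8 16 32 64 120 120
```

After only a handful of failed retransmissions the wait between attempts sits at the cap, which is why restarting the transport is quicker than waiting for tcp to notice.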

(This used to be ctdb commit 0256db470879ce556b0f00070f7ebeaf37e529ab)
2007-10-19 08:58:30 +10:00
Ronnie Sahlberg
755511d28d set the flags explicitly instead of masking them in
(This used to be ctdb commit 27a5f9dead44890683f9dbc4f07cda11264aa03b)
2007-10-18 16:54:00 +10:00
Andrew Tridgell
b814462c38 added some debug lines to help track down a problem
(This used to be ctdb commit 2ca31e9de179f76e392a26cc8305e2473357c760)
2007-10-18 16:27:36 +10:00
Ronnie Sahlberg
06cdb1ff31 merge from tridge
(This used to be ctdb commit ad03e63906270c9c076ffdb1f62f912bb414ea10)
2007-10-18 15:53:50 +10:00
Andrew Tridgell
5e3d5b1314 merge from ronnie
(This used to be ctdb commit a6b094fdede0ae850e87877fad0b9dd1f3a26869)
2007-10-18 15:51:15 +10:00
Andrew Tridgell
d939a2901b merge from ronnie
(This used to be ctdb commit 75d4b386293e186a6bb8532515585ab72670d663)
2007-10-18 15:44:02 +10:00
Ronnie Sahlberg
e4ec6e9d6b flush the route cache when we have added the single public ip to the
node

cleanup and remove everything when we do a shutdown event

(This used to be ctdb commit 221432f45073bc7624803058c8bbf18838e7ceeb)
2007-10-18 14:13:48 +10:00
Ronnie Sahlberg
537841fadb use NF_DROP instead of NF_STOLEN when we tell the kernel to not worry
about this packet any more and just forget it ever saw it

(This used to be ctdb commit 42a2a777cbc15a8cbbea7ecf2fb1c6dafa242d0c)
2007-10-17 15:03:58 +10:00
Ronnie Sahlberg
9a93f4b8df reverse the order in which public ips are listed so it matches the order
of the public_addresses file

(This used to be ctdb commit ce987661edd9160982e65866fb773445d296e5c7)
2007-10-17 13:42:42 +10:00
Ronnie Sahlberg
805ba22d65 merge from tridge
(This used to be ctdb commit 87760a95ec0a9e3cb2c415c569235a1ff58318cb)
2007-10-17 10:10:52 +10:00
Andrew Tridgell
85f91b9d5c increase release number
(This used to be ctdb commit 69fe7ce1d7874ce51d79de29adc53c207cb8869f)
2007-10-16 20:14:04 +10:00
Andrew Tridgell
6b9d73a96d more detail on multipath config
(This used to be ctdb commit 78c44f2267cbef5fbc57d56dfd5ff40972733a1f)
2007-10-16 20:13:28 +10:00
Ronnie Sahlberg
ce7a054d20 add back the test inside the daemon that if someone asks us to drop
recovery mode back to NORMAL that we can not lock the reclock file   
since at this stage it MUST be locked by the recovery daemon.

in order to avoid a non-blocking fcntl() lock from blocking and causing 
"issues" we move the 'test that we can not lock reclock file' into a 
child process.

(This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e)
2007-10-16 15:27:07 +10:00
Ronnie Sahlberg
056aac6e0c add a new tunable : DeterministicIPs that makes the allocation of
public addresses to nodes deterministic.

Activate it by adding CTDB_SET_DeterministicIPs=1 in /etc/sysconfig/ctdb

When this is set, the first entry in /etc/ctdb/public_addresses will 
always be hosted by node 0 when that node is available, the second 
entry by node 1, and so on.

This tunable allows the allocation of addresses to become very 
unbalanced and is only for debugging/testing use.
Beware, this feature requires that /etc/ctdb/public_addresses are 
identical on all the nodes in the cluster.
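
The mapping described above can be sketched like this. It is an illustration, not the ctdb implementation: `assign_deterministic` is a hypothetical helper, the addresses are documentation examples, node availability is ignored, and wrapping around with a modulo once there are more addresses than nodes is an assumption.

```shell
#!/bin/sh
# Read public_addresses-style lines ("address/mask interface") on stdin
# and print which node would host each entry.
# usage: assign_deterministic <number_of_nodes>
assign_deterministic() {
    num_nodes=$1
    idx=0
    while read -r addr _rest; do
        [ -n "$addr" ] || continue          # skip empty lines
        # Entry 0 goes to node 0, entry 1 to node 1, and so on;
        # the modulo wrap-around is an assumption of this sketch.
        echo "$addr node$((idx % num_nodes))"
        idx=$((idx + 1))
    done
}

assignment=$(printf '%s\n' \
    "192.0.2.1/24 eth0" \
    "192.0.2.2/24 eth0" \
    "192.0.2.3/24 eth0" | assign_deterministic 2)
printf '%s\n' "$assignment"
```

Because the result depends only on the order of the entries, every node computes the same answer only if the file is identical everywhere, which is the requirement stated above.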

(This used to be ctdb commit f0ca221f235731542090d8a6c86f2b7cd2ce2f96)
2007-10-16 12:15:02 +10:00
Ronnie Sahlberg
25d3a031d0 include system/network.h so we get the prototype for inet_aton()
(This used to be ctdb commit 7145764b2d217f88a723dcb0ffd4e5a1567d64cf)
2007-10-16 11:29:33 +10:00
Ronnie Sahlberg
7e2e1b14fb merge from tridge
(This used to be ctdb commit 9e6bc12c9be2dabcfb9c6aeef257ef4737287fab)
2007-10-16 11:26:22 +10:00
Ronnie Sahlberg
b3ff7d904d dont try to lock the file from inside the ctdb daemon.
even though we dont want a blocking lock, it does appear that the fcntl()
call can block for a while if gpfs is in the process of rebuilding 
itself after a node arriving/leaving the cluster

(This used to be ctdb commit 6c0d206dea7116db71bccb4802a93dd7283249f6)
2007-10-16 09:50:31 +10:00