mirror of https://github.com/samba-team/samba.git synced 2024-12-24 21:34:56 +03:00

95 Commits

Author SHA1 Message Date
Ronnie Sahlberg
fa872de664 60.nfs:
we must always restart the lockmanager when the cluster has been 
reconfigured and ip addresses have changed. This is to make sure we get a 
clusterwide grace period for nfs locking.
if we don't do this and only restart locking on the nodes that were 
directly affected, a different client can take out a conflicting lock 
from a different node before the affected clients have had a chance to
reclaim all the locks lost during the reconfigure.
the grace period on the rhel5 kernel has been increased to 90 seconds!

statd-callout:
we must restart lockmanager to ensure a clusterwide grace period for 
nfs. this makes locking "more correct" for nfs clients and prevents
other clients/nodes from taking out a conflicting lock while a different
client/node tries to reclaim lost locks.
This makes it "almost consistent" for NFS clients, but there is still 
the possibility that a cifs client can take out a conflicting lock 
before an nfs client has had a chance to reclaim an existing lock.
This cannot be solved with anything less than making the kernel nfs 
lock manager "samba aware" and making samba aware of the internal state 
of the kernel lock manager so that they can cooperate.

we cannot just stop/start the lockmanager back to back in rhel5. if it is
stopped and started too close together, then when the new lockmanager
sends out statd notifications on startup, two things can happen:
1, the new lockmanager sends out the notification BEFORE it has registered
   with portmapper, leading to:
  lockmanager starts
  lockmanager sends notification to the client
  client tries to recover the lock and tries to portmap the lockmanager
  port on the server.
  server is not (yet) registered with portmapper and responds
  "no such program" to the client's request to discover where the
  lockmanager is.
  client then just completely gives up reclaiming the lock and doesn't
  even reattempt the portmapper call after some timeout.
  ==> lock reclaim failed.
2, if they are started back to back, and a client tries to reclaim the
   lock, the lockmanager sometimes sends two responses back to back
   to the client: one with status NLM_GRANTED (== you got the lock
   reclaimed) and one with status NLM_DENIED (== you could not get the
   lock reclaimed).
   This confuses the client and leads to the server thinking that the
   client does have the lock and the client thinking it has not got the
   lock, and orphaned locks result.


We also send out additional notification messages of different formats
to allow more legacy clients to interoperate with locking.

(This used to be ctdb commit 13208c1aab2942e28dff87e38e6794bf0c026033)
2007-09-07 08:52:56 +10:00
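The restart sequencing described in fa872de664 can be sketched roughly as follows. This is only an illustration, not the code from 60.nfs or statd-callout: the function name, the injected `stop`/`start` callables and the `gap_seconds` default are all hypothetical, and the commit message does not say how long the pause needs to be.

```python
import time
from typing import Callable


def restart_lockmanager(
    stop: Callable[[], None],
    start: Callable[[], None],
    gap_seconds: float = 2.0,  # hypothetical value, not taken from the commit
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Stop the lock manager, pause, then start it again.

    The pause keeps the stop and the start from running back to back,
    which is the situation the commit message describes as leading to
    failed lock reclaims and duplicate NLM responses.
    """
    stop()
    sleep(gap_seconds)
    start()
```

Passing the stop, start and sleep steps in as callables keeps the sketch independent of any particular init script, and makes the ordering easy to check without touching a real service.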
Ronnie Sahlberg
00453a375a improve the handling of hosts to notify with statd
(This used to be ctdb commit cc87bda7e344bc777b9620a6211e62de4dce4e3b)
2007-09-06 11:30:49 +10:00
Ronnie Sahlberg
46eecfea27 we don't use 'sendip' any more so don't check for it and exit from the
61.nfstickles script if it is missing from the host

(This used to be ctdb commit 8eac441e24f4ef33b55f9eaa4856b5c1e1c15213)
2007-09-05 15:39:51 +10:00
Ronnie Sahlberg
12ebb74838 change how we do public addresses and takeover so that we can have
multiple public addresses spread across multiple interfaces on each 
node.

this is a massive patch since we have previously made the assumption that 
we only have one public address per node.

get rid of the public_interface argument.  the public addresses file 
now explicitly lists which interface the address belongs to

(This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8)
2007-09-04 09:50:07 +10:00
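To illustrate what 12ebb74838 means by a public addresses file that lists the interface for each address, here is a small parser sketch. The exact file layout is not given in the commit message, so the "address/mask interface" line format, the function name and the `PublicAddress` type are assumptions made for this example.

```python
import ipaddress
from typing import List, NamedTuple, Union


class PublicAddress(NamedTuple):
    # Address together with its netmask, e.g. 10.0.0.1/24
    address: Union[ipaddress.IPv4Interface, ipaddress.IPv6Interface]
    # Name of the network interface the address belongs to
    interface: str


def parse_public_addresses(text: str) -> List[PublicAddress]:
    """Parse lines of the assumed form '<address>/<mask> <interface>'."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError(f"expected 'address/mask interface', got: {line!r}")
        entries.append(PublicAddress(ipaddress.ip_interface(fields[0]), fields[1]))
    return entries
```

With a per-line interface, one node can carry several public addresses spread over different interfaces, which a single public_interface argument could not express.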
Ronnie Sahlberg
4e61e05f49 when we start 60.nfs we must make sure that the shared storage
nfs-state directory actually exists (by creating it)
or else the lock manager will not start 

(This used to be ctdb commit f2d15d04df842538c8d8331796a3c6fbe23463f2)
2007-08-30 15:27:45 +10:00
Ronnie Sahlberg
1ee8c79db7 start winbind before smbd
(This used to be ctdb commit d6a2e22a6d688cfcf5631c8de68fc8ef721635d6)
2007-08-16 11:34:35 +10:00
Ronnie Sahlberg
ce91401724 we should start winbindd before we start smb
(This used to be ctdb commit 03aad3ea55c4816a3790ac9336026b4872a65310)
2007-08-16 11:18:16 +10:00
Ronnie Sahlberg
3b9d50f3ee change the now rather small /etc/ctdb/events script into a service
specific script /etc/ctdb/events.d/00.ctdb

get rid of CTDB_EVENTS_SCRIPT and --event-script

(This used to be ctdb commit 81ccfaf838e5772d4a58eb6a70224b7b39aba9f3)
2007-08-15 15:01:31 +10:00
Ronnie Sahlberg
1fa787e667 fix typo
(This used to be ctdb commit c7a8e7b506f98240c0e9f705fe1f504a6a56a332)
2007-08-15 11:38:27 +10:00
Ronnie Sahlberg
83dbfecad7 add a description of how the event scripts work to the README and make
sure it is installed in /etc/ctdb/events.d

(This used to be ctdb commit adec62a924af5bb023f346e705515b09dbe64f21)
2007-08-15 11:36:01 +10:00
Andrew Tridgell
fb22d3bd2c merged from ronnie
(This used to be ctdb commit 765b07fa5d1af07c8c7212d19d8e9574060b3039)
2007-07-18 20:13:57 +10:00
Ronnie Sahlberg
7e532f8f83 we don't do nfstickles unless ctdb manages nfs
(This used to be ctdb commit 0622b4a969abdc8bd11f200ed5ae1c7b1d188db7)
2007-07-15 11:43:11 +10:00
Ronnie Sahlberg
643b87fbae fix bug introduced in previous commit
(This used to be ctdb commit 8396a7500225c90165ebcfbdc2c65673740e6b25)
2007-07-15 11:37:22 +10:00
Ronnie Sahlberg
e96f733052 there is no point in doing anything in 10.interfaces unless we have a
public interface

(This used to be ctdb commit c0335ee92b16a1e2dfcb37a39872b66a35b0ab94)
2007-07-15 11:28:53 +10:00
Ronnie Sahlberg
4b6d9485ab ctdb killtcp no longer takes a <numrst> argument to control how many
times to try the reset.

the reset retry attempt is now handled inside the daemon

update the 60.nfs script and remove this parameter that is no longer 
used

(This used to be ctdb commit 30fb09b8b9a989e5cfe86b6daf2dcd2487013344)
2007-07-12 08:31:56 +10:00
Ronnie Sahlberg
ed1a52b293 use the socketkiller to kill off all lock manager sessions as well
(This used to be ctdb commit 980b090001ed3a77001e2a3bfc1b03833498f434)
2007-07-10 13:09:35 +10:00
Ronnie Sahlberg
d81bca2072 make it possible to specify how many times ctdb killtcp will try to RST
the tcp connection

change the 60.nfs script to run ctdb killtcp in the foreground so we 
don't get lots of these running in parallel when there are a lot of tcp 
connections to rst

(This used to be ctdb commit d81616214752882242f2886e94681972a790db80)
2007-07-10 10:24:20 +10:00
Ronnie Sahlberg
1c32f65ee0 run the ctdb killtcp in the background
(This used to be ctdb commit d6a514c2b3d427099ed669eef104146608378fa8)
2007-07-10 10:07:26 +10:00
Ronnie Sahlberg
dbc66d054b don't restart the tcp service after an ip takeover, it is more efficient
to just kill off the tcp connections

(This used to be ctdb commit bc481c3f1a44c50648488c4f8a7f15ec395d446f)
2007-07-10 09:45:14 +10:00
Ronnie Sahlberg
34e2c73020 use 'ctdb tickle' instead of sendip to tickle nfs clients.
(This used to be ctdb commit 2204cc77ce6b1dd6bb0118f57cfa05f0c8826c3e)
2007-07-06 11:51:34 +10:00
Ronnie Sahlberg
72265dd5bd remove 59.nfslock and fold this into 60.nfs
add a 61.nfstickle script to make nfs failover faster

(This used to be ctdb commit da71fa874d49346d229307d424f889994a205c89)
2007-07-06 10:54:42 +10:00
Andrew Tridgell
f532ada445 run smbstatus every 10 minutes to scrub databases
(This used to be ctdb commit cd119cdb9a1a7e0545f1c33a2a156a3d3c5d7645)
2007-06-18 03:15:08 +10:00
Andrew Tridgell
669a6b13f9 merge from ronnie
(This used to be ctdb commit 7bfc1be6dff5bd5acadfa8a3fd8f00a8ce87ca54)
2007-06-18 03:10:50 +10:00
Ronnie Sahlberg
d2ada57f60 add a mechanism to the samba event script to do periodic cleanup of the
databases once every 60 minutes

(This used to be ctdb commit 8762e08284343bf68bfed90838483e5d53db24ce)
2007-06-18 02:34:29 +10:00
Andrew Tridgell
732353de5f - merged ctdb_store test from ronnie
- added DatabaseHashSize tunable
- added logging of events inside recovery (for timing)

(This used to be ctdb commit 3593cdb928b91e217faf1b3c537fa28dc82cdace)
2007-06-17 23:31:44 +10:00
Andrew Tridgell
9d0a595594 check winbind in monitoring event too
(This used to be ctdb commit bccba656c21d0edbd9840401a3c43a76b1b3bc05)
2007-06-17 12:05:29 +10:00
Andrew Tridgell
d683080b08 - wait for winbind on samba start
- use $PATH for ctdb status

(This used to be ctdb commit cf8d837cead1cbcb22c71ebbc3947970d1a565a3)
2007-06-17 11:57:42 +10:00
Ronnie Sahlberg
741af6a468 note that there is no link on the PUBLIC interface
(This used to be ctdb commit 3582f12f837dbd3c866cdffd2e7f5c20bae59d10)
2007-06-14 17:26:42 +10:00
Andrew Tridgell
8120da0e9d fixed testparm calls
(This used to be ctdb commit 0835abffc0caa2a04cb717a636e77c71355f3c80)
2007-06-11 13:56:50 +10:00
Andrew Tridgell
2703ba210d merge from ronnie
(This used to be ctdb commit 1a0bd55dd27939110385e00dad73726a8ba66747)
2007-06-11 09:43:23 +10:00
Ronnie Sahlberg
47edceec09 when public interface is not set, print this to the logfile before
exiting the script

(This used to be ctdb commit 79f4a9faea7583aad6f39733d019ba416a4be6e5)
2007-06-11 08:42:51 +10:00
Andrew Tridgell
4e0b95ec9c newer versions of ip need the mask on del
(This used to be ctdb commit b5b13125506256f9ef6599498ee046e73b52df66)
2007-06-09 21:46:42 +10:00
Andrew Tridgell
d1c225a0b9 disable a node if testparm thinks there is an error, or warning, or an unrecognised option
(This used to be ctdb commit ded80c83002a267996b4616e3702988b821cd422)
2007-06-06 19:46:25 +10:00
Andrew Tridgell
76b7361c7e - added monitoring of rpc ports for nfs, and of Samba ports and directories
- added monitoring of the ethernet link state

When monitoring detects an error, the node loses its public IP address

(This used to be ctdb commit 0af57aead8c983511d25774b4ffe09fa5ff26501)
2007-06-06 12:08:42 +10:00
Ronnie Sahlberg
317dec2f9e merge from tridge
(This used to be ctdb commit 5f1f889e0e124c5275463795c004ae971945e1ae)
2007-06-05 18:16:45 +10:00
Ronnie Sahlberg
96a12cc4ab add a simple events script to manage vsftpd
(This used to be ctdb commit 413efc7af529e4ebda6f7ea6e36f79ba72a2d1d9)
2007-06-05 18:14:01 +10:00
Andrew Tridgell
ac55bc4166 first step in health monitoring of cluster nodes. When not healthy they will be marked disabled
(This used to be ctdb commit d3dbd9fc4db21632075b56fc52cf95435c63374a)
2007-06-05 17:43:19 +10:00
Andrew Tridgell
ee747b5bd6 set close on exec on pipe in event scripts, so long running scripts don't hold the pipe
(This used to be ctdb commit 22662614b4091a4e4282e63d6876097cbf3e3d6e)
2007-06-05 15:18:37 +10:00
Ronnie Sahlberg
32d19d3791 don't use CTDB_MANAGES_NFS for controlling the lockmanager
use a dedicated variable CTDB_MANAGES_NFSLOCK, since some might want to 
use nfs but no lockmanager

(This used to be ctdb commit 1e8cec86617ffb188bd49c70f074a4b350d3fe3d)
2007-06-05 12:43:35 +10:00
Andrew Tridgell
bd81cc521d ignore commented out entries in /etc/exports
(This used to be ctdb commit d316b49ba46e819359f045adfd87da92860fd1b5)
2007-06-04 23:54:22 +10:00
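The idea behind bd81cc521d, skipping commented-out lines when reading /etc/exports, can be sketched like this. It is a simplified illustration rather than the script's actual logic: the function name is hypothetical, and it assumes each export is a single line whose first whitespace-separated field is the directory (quoted paths and continuation lines are not handled).

```python
from typing import List


def exported_directories(exports_text: str) -> List[str]:
    """Return the exported directories, ignoring comments and blank lines."""
    dirs = []
    for line in exports_text.splitlines():
        stripped = line.strip()
        # A line whose first non-blank character is '#' is commented out.
        if not stripped or stripped.startswith("#"):
            continue
        dirs.append(stripped.split()[0])
    return dirs
```

Without the comment check, a disabled entry would still be treated as an export that has to exist before nfs can be started or considered healthy.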
Andrew Tridgell
dbb2ec43dd added tunables settable using ctdb command line tool
(This used to be ctdb commit 73d440f8cb19373cfad7a2f0f0ca4f963c57ff29)
2007-06-04 19:53:19 +10:00
Andrew Tridgell
cc9f6d30d8 split out the basic interface handling, and run event scripts in a deterministic order
(This used to be ctdb commit 399e993a4a233a5953e1e7264141e5c7c8c8c711)
2007-06-04 15:09:03 +10:00
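One simple way to get the deterministic order mentioned in cc9f6d30d8 is to sort the script names, so that the numeric prefix decides when each one runs. The commit message only says the order is deterministic; sorting by file name is an assumption here, and the function name is hypothetical. The script names in the usage note are ones that appear elsewhere in this log.

```python
from typing import Iterable, List


def event_script_order(names: Iterable[str]) -> List[str]:
    """Return event script names in plain lexical order.

    With two-digit prefixes such as '10.' or '60.', lexical order matches
    numeric order, so lower-numbered scripts run first.
    """
    return sorted(names)
```

For example, `event_script_order(["60.nfs", "10.interfaces", "61.nfstickle", "00.ctdb"])` yields `["00.ctdb", "10.interfaces", "60.nfs", "61.nfstickle"]`, regardless of the order in which the directory listing returned the names.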
Andrew Tridgell
e763874872 make the init scripts more portable about location of system config files
(This used to be ctdb commit 65f3e2bc722e314b2c51c3bfdc544b408a8a64cf)
2007-06-03 22:07:07 +10:00
Andrew Tridgell
b4542aa00a don't start nfs services unless the relevant directories are available
(This used to be ctdb commit e0468d61119b6581f5ec458641568d03714a5786)
2007-06-03 14:39:27 +10:00
Andrew Tridgell
794d6dd59d move config files to config/ directory
(This used to be ctdb commit f95de519b885c8e1f40df0cda70fd796e479a22a)
2007-06-02 19:40:07 +10:00