mirror of https://github.com/samba-team/samba.git synced 2025-01-08 21:18:16 +03:00

1224 Commits

Author SHA1 Message Date
Andrew Tridgell
023b885793 new approach for killing TCP connections on IP release
(This used to be ctdb commit c33a0db29b5604966f582b1f8c5fd66760c72197)
2007-09-13 10:24:48 +10:00
Andrew Tridgell
1b53ecc445 remove clutter from ctdb log file
(This used to be ctdb commit 54d5dcaaee0498f40bbee5059cc72d0ca75d33b7)
2007-09-13 10:03:18 +10:00
Andrew Tridgell
96c54c6188 handle hung or slow ctdb daemons on shutdown
(This used to be ctdb commit a3089211782ab12387c1b04efa28914c94d89b30)
2007-09-12 13:26:24 +10:00
Andrew Tridgell
6c77184d96 - set arp_ignore to prevent replying to arp requests for addresses on loopback
- put removed IPs on loopback with scope host
- check for nul strings in ethtool call
;

(This used to be ctdb commit e2df1d6d08e67a36ff05a590a34c56e900741287)
2007-09-12 13:23:36 +10:00
Andrew Tridgell
a6728e0520 fixed location of arp_filter
(This used to be ctdb commit ea239c82fca2b9a648d21e5c603e632011958452)
2007-09-11 16:38:32 +10:00
Andrew Tridgell
57d8102cf8 added back --public-interface to startup script
(This used to be ctdb commit 9e9cb3c0da7251f522c655366ef0868037577a9c)
2007-09-10 15:09:28 +10:00
Ronnie Sahlberg
50381480eb update a comment
(This used to be ctdb commit e7d3ef4443686529299e8f293398cc0522235627)
2007-09-10 07:45:57 +10:00
Ronnie Sahlberg
4ac749bfa4 change the signature to ctdb_sys_have_ip() to also return:
a bool that specifies whether the ip was held by a loopback adaptor or 
not
 the name of the interface where the ip was held

when we release an ip address from an interface, move the ip address 
over to the loopback interface

when we release an ip address after we have moved it onto loopback, 
use 60.nfs to kill off the server side (the local part) of the tcp 
connection so that the tcp connections dont survive a 
failover/failback

61.nfstickle: since we kill the tcp connections when we release an ip 
address, we no longer need to restart the nfs service in 61.nfstickle

update ctdb_takeover to use the new signature for ctdb_sys_have_ip

when we add a tcp connection to kill in ctdb_killtcp_add_connection()
check if either the source or destination address matches a known public 
address

(This used to be ctdb commit f9fd2a4719c50f6b8e01d0a1b3a74b76b52ecaf3)
2007-09-10 07:20:44 +10:00
Ronnie Sahlberg
0ebd7beb4b set /proc/sys/net/ipv4/conf/all/arp_filter to 1 by default when
10.interfaces starts up

this setting makes the system only respond to ARP requests from the NIC 
that the ip address is tied to and adds to the 
"principle of least surprise" when using multihomed servers

(This used to be ctdb commit 39ddf347dc45f599964a4c17e67e71faed00e544)
2007-09-08 08:09:02 +10:00
Ronnie Sahlberg
eb7a15730e add a short delay after stopping nfslock to make it less likely that
"weird" things happen

(This used to be ctdb commit 4934c083cbcc19714094e08a0b7da1fb6fdc8a5a)
2007-09-07 12:14:53 +10:00
Ronnie Sahlberg
fa872de664 60.nfs:
we must always restart the lockmanager when the cluster has been 
reconfigured and ip addresses have changed. This is to make sure we get a 
clusterwide grace period for nfs locking.
if we dont do this and only restart locking on the nodes that were 
directly affected, a different client can take out a conflicting lock 
from a different node before affected clients have had a chance to
reclaim all the locks lost during reconfigure.
grace period on rhel5 kernel has been increased to 90 seconds!

statd-callout:
we must restart lockmanager to ensure a clusterwide grace period for 
nfs. this makes locking "more correct" for nfs clients and prevents
other clients/nodes from taking out a conflicting lock while a different
client/node tries to reclaim lost locks.
This makes it "almost consistent" for NFS clients, but there is still 
the possibility that a cifs client can take out a conflicting lock 
before an nfs client has had a chance to reclaim an existing lock.
This can not be solved with anything less than making the kernel nfs 
lock manager "samba aware" and making samba aware of the internal state 
of the kernel lock manager so that they can cooperate.

we can not just stop/start the lockmanager back to back in rhel5. if 
the stop and start are too close to each other, two things can happen 
when the new lockmanager sends out statd notifications on startup:
1, new lockmanager sends out notification BEFORE it has registered with 
portmapper leading to 
  lockmanager starts
  lockmanager sends notification to the client
  client tries to recover the lock and tries to portmap the lockmanager
  port on the server.
  server is not (yet) registered with portmapper and server responds
  "no such program" to the client's request to discover where lockmanager
  is.
  client then just completely gives up reclaiming the lock and doesnt 
  even reattempt the portmapper call after some timeout.
  ==> lock reclaim failed.
2, if they are started back to back, and a client tries to reclaim the
   lock, the lockmanager sometimes sends two responses back to back
   to the client: one with status NLM_GRANTED (==you got the lock 
   reclaimed) and one with status NLM_DENIED (==you could not get the 
   lock reclaimed).
   This confuses the client and leads to the server thinking that the 
   client does have the lock and the client thinking it has not got the 
   lock, and orphaned locks result.


We also send out additional notification messages of different formats
to allow more legacy clients to interoperate with locking.

(This used to be ctdb commit 13208c1aab2942e28dff87e38e6794bf0c026033)
2007-09-07 08:52:56 +10:00
Ronnie Sahlberg
00453a375a improve the handling of hosts to notify with statd
(This used to be ctdb commit cc87bda7e344bc777b9620a6211e62de4dce4e3b)
2007-09-06 11:30:49 +10:00
Ronnie Sahlberg
46eecfea27 we dont use 'sendip' any more so dont check for it and exit from the
61.nfstickles script if it is missing from the host

(This used to be ctdb commit 8eac441e24f4ef33b55f9eaa4856b5c1e1c15213)
2007-09-05 15:39:51 +10:00
Ronnie Sahlberg
12ebb74838 change how we do public addresses and takeover so that we can have
multiple public addresses spread across multiple interfaces on each 
node.

this is a massive patch since we have previously made the assumption that 
we only have one public address per node.

get rid of the public_interface argument.  the public addresses file 
now explicitly lists which interface the address belongs to

(This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8)
2007-09-04 09:50:07 +10:00
Ronnie Sahlberg
4e61e05f49 when we start 60.nfs we must make sure that the shared storage
nfs-state directory actually exists (by creating it)
or else the lock manager will not start 

(This used to be ctdb commit f2d15d04df842538c8d8331796a3c6fbe23463f2)
2007-08-30 15:27:45 +10:00
Ronnie Sahlberg
1ee8c79db7 start winbind before smbd
(This used to be ctdb commit d6a2e22a6d688cfcf5631c8de68fc8ef721635d6)
2007-08-16 11:34:35 +10:00
Ronnie Sahlberg
ce91401724 we should start winbindd before we start smb
(This used to be ctdb commit 03aad3ea55c4816a3790ac9336026b4872a65310)
2007-08-16 11:18:16 +10:00
Ronnie Sahlberg
3b9d50f3ee change the now rather small /etc/ctdb/events script into a service
specific script /etc/ctdb/events.d/00.ctdb

get rid of CTDB_EVENTS_SCRIPT and --event-script

(This used to be ctdb commit 81ccfaf838e5772d4a58eb6a70224b7b39aba9f3)
2007-08-15 15:01:31 +10:00
Ronnie Sahlberg
4023576e50 call the service specific event scripts directly from the forked child
instead of from /etc/ctdb/events so that we can get better debugging 
output in the logs when something fails in the scripts

(This used to be ctdb commit 4ed96b768aea1611e8002f7095d3c4d12ccf77a3)
2007-08-15 14:44:03 +10:00
Ronnie Sahlberg
1fa787e667 fix typo
(This used to be ctdb commit c7a8e7b506f98240c0e9f705fe1f504a6a56a332)
2007-08-15 11:38:27 +10:00
Ronnie Sahlberg
83dbfecad7 add a description of how the event scripts work to the README and make
sure it is installed in /etc/ctdb/events.d

(This used to be ctdb commit adec62a924af5bb023f346e705515b09dbe64f21)
2007-08-15 11:36:01 +10:00
Ronnie Sahlberg
8b58fe2489 do not restart lockd/statd when we take over an ip address. this is
overkill since
1, we now kill the tcp connections for lockd in 60.nfs
2, rpc.statd on linux sends out the notifications using the wrong 
interface anyway, which breaks a lot of clients, including linux!

use our own smnotify tool instead of sm-notify

(This used to be ctdb commit 0163ad0ec01be6189a98ea91e5cec40f6750218f)
2007-08-04 11:23:04 +10:00
Andrew Tridgell
fb22d3bd2c merged from ronnie
(This used to be ctdb commit 765b07fa5d1af07c8c7212d19d8e9574060b3039)
2007-07-18 20:13:57 +10:00
Ronnie Sahlberg
7e532f8f83 we dont do nfstickles unless ctdb manages nfs
(This used to be ctdb commit 0622b4a969abdc8bd11f200ed5ae1c7b1d188db7)
2007-07-15 11:43:11 +10:00
Ronnie Sahlberg
643b87fbae fix bug introduced in previous commit
(This used to be ctdb commit 8396a7500225c90165ebcfbdc2c65673740e6b25)
2007-07-15 11:37:22 +10:00
Ronnie Sahlberg
e96f733052 there is no point in doing anything in 10.interfaces unless we have a
public interface

(This used to be ctdb commit c0335ee92b16a1e2dfcb37a39872b66a35b0ab94)
2007-07-15 11:28:53 +10:00
Ronnie Sahlberg
8e89b27098 try netstat as a last attempt to check a tcp port in
ctdb_check_tcp_ports() as well

(This used to be ctdb commit ad0292726f9cfc8afe3733b30ac2d5621e9a48f1)
2007-07-15 09:29:08 +10:00
Ronnie Sahlberg
4c276ded1f if we dont have nc or netcat, try using netstat as a final attempt to
check for tcp ports

(the check for these tools should not really use hardcoded paths)

(This used to be ctdb commit 56d77082c07a519dd3804cc24cc7ba889b8469ff)
2007-07-15 09:26:54 +10:00
Ronnie Sahlberg
3890fde07f if we dont have /etc/sysconfig and we dont have /etc/default
check /etc/ctdb/sysconfig as a last option

(This used to be ctdb commit 1043929ceb0cd04ab6466e9a5d7d52f9af1cb8e8)
2007-07-15 09:13:50 +10:00
Ronnie Sahlberg
82824e0680 when we have found that /etc/rc.d/init.d/SERVICE exists, then run that
script and not /etc/rc.d/SERVICE

(This used to be ctdb commit 7f0c3a02ef11fd19c8cd5116fd451ebd10ba5d1b)
2007-07-15 08:54:48 +10:00
Andrew Tridgell
1e14ecd176 - merge from ronnie
- cleaner handling of system capture socket

(This used to be ctdb commit d194a41a71b8466d0726dcbae3970a86386fcb3c)
2007-07-13 11:31:18 +10:00
Andrew Tridgell
0becf46deb allow extra option override in /etc/sysconfig/ctdb
(This used to be ctdb commit f46fae64263ea4776e4bbf9cf14dff17b5b68ddb)
2007-07-13 09:14:15 +10:00
Andrew Tridgell
fc73bc5c24 added --nosetsched option to ctdbd
(This used to be ctdb commit 4cbbb88c1735c7d112e751e22da1c1c69e09bf4a)
2007-07-13 08:47:02 +10:00
Ronnie Sahlberg
4b6d9485ab ctdb killtcp no longer takes a <numrst> argument to control how many
times to try the reset.

the reset retry attempt is now handled inside the daemon

update the 60.nfs script and remove this parameter that is no longer 
used

(This used to be ctdb commit 30fb09b8b9a989e5cfe86b6daf2dcd2487013344)
2007-07-12 08:31:56 +10:00
Ronnie Sahlberg
ed1a52b293 use the socketkiller to kill off all lock manager sessions as well
(This used to be ctdb commit 980b090001ed3a77001e2a3bfc1b03833498f434)
2007-07-10 13:09:35 +10:00
Ronnie Sahlberg
d81bca2072 make it possible to specify how many times ctdb killtcp will try to RST
the tcp connection

change the 60.nfs script to run ctdb killtcp in the foreground so we 
dont get lots of these running in parallel when there are a lot of tcp 
connections to rst

(This used to be ctdb commit d81616214752882242f2886e94681972a790db80)
2007-07-10 10:24:20 +10:00
Ronnie Sahlberg
1c32f65ee0 run the ctdb killtcp in the background
(This used to be ctdb commit d6a514c2b3d427099ed669eef104146608378fa8)
2007-07-10 10:07:26 +10:00
Ronnie Sahlberg
dbc66d054b dont restart the tcp service after an ip takeover, it is more efficient
to just kill off the tcp connections

(This used to be ctdb commit bc481c3f1a44c50648488c4f8a7f15ec395d446f)
2007-07-10 09:45:14 +10:00
Ronnie Sahlberg
34e2c73020 use 'ctdb tickle' instead of sendip to tickle nfs clients.
(This used to be ctdb commit 2204cc77ce6b1dd6bb0118f57cfa05f0c8826c3e)
2007-07-06 11:51:34 +10:00
Ronnie Sahlberg
72265dd5bd remove 59.nfslock and fold this into 60.nfs
add a 61.nfstickle script to make nfs failover faster

(This used to be ctdb commit da71fa874d49346d229307d424f889994a205c89)
2007-07-06 10:54:42 +10:00
Andrew Tridgell
f532ada445 run smbstatus every 10 minutes to scrub databases
(This used to be ctdb commit cd119cdb9a1a7e0545f1c33a2a156a3d3c5d7645)
2007-06-18 03:15:08 +10:00
Andrew Tridgell
669a6b13f9 merge from ronnie
(This used to be ctdb commit 7bfc1be6dff5bd5acadfa8a3fd8f00a8ce87ca54)
2007-06-18 03:10:50 +10:00
Ronnie Sahlberg
d2ada57f60 add a mechanism to the samba event script to do periodic cleanup of the
databases once every 60 minutes

(This used to be ctdb commit 8762e08284343bf68bfed90838483e5d53db24ce)
2007-06-18 02:34:29 +10:00
Andrew Tridgell
732353de5f - merged ctdb_store test from ronnie
- added DatabaseHashSize tunable
- added logging of events inside recovery (for timing)

(This used to be ctdb commit 3593cdb928b91e217faf1b3c537fa28dc82cdace)
2007-06-17 23:31:44 +10:00
Andrew Tridgell
9d0a595594 check winbind in monitoring event too
(This used to be ctdb commit bccba656c21d0edbd9840401a3c43a76b1b3bc05)
2007-06-17 12:05:29 +10:00
Andrew Tridgell
d683080b08 - wait for winbind on samba start
- use $PATH for ctdb status

(This used to be ctdb commit cf8d837cead1cbcb22c71ebbc3947970d1a565a3)
2007-06-17 11:57:42 +10:00
Ronnie Sahlberg
741af6a468 note that there is no link on the PUBLIC interface
(This used to be ctdb commit 3582f12f837dbd3c866cdffd2e7f5c20bae59d10)
2007-06-14 17:26:42 +10:00
Andrew Tridgell
8120da0e9d fixed testparm calls
(This used to be ctdb commit 0835abffc0caa2a04cb717a636e77c71355f3c80)
2007-06-11 13:56:50 +10:00
Andrew Tridgell
2703ba210d merge from ronnie
(This used to be ctdb commit 1a0bd55dd27939110385e00dad73726a8ba66747)
2007-06-11 09:43:23 +10:00
Ronnie Sahlberg
47edceec09 when public interface is not set, print this to the logfile before
exiting the script

(This used to be ctdb commit 79f4a9faea7583aad6f39733d019ba416a4be6e5)
2007-06-11 08:42:51 +10:00
Andrew Tridgell
4e0b95ec9c newer versions of ip need the mask on del
(This used to be ctdb commit b5b13125506256f9ef6599498ee046e73b52df66)
2007-06-09 21:46:42 +10:00
Andrew Tridgell
d1c225a0b9 disable a node if testparm thinks there is an error, or warning, or an unrecognised option
(This used to be ctdb commit ded80c83002a267996b4616e3702988b821cd422)
2007-06-06 19:46:25 +10:00
Andrew Tridgell
76b7361c7e - added monitoring of rpc ports for nfs, and of Samba ports and directories
- added monitoring of the ethernet link state

When monitoring detects an error, the node loses its public IP address

(This used to be ctdb commit 0af57aead8c983511d25774b4ffe09fa5ff26501)
2007-06-06 12:08:42 +10:00
Andrew Tridgell
b4f764c269 fixed error handling in event scripts
(This used to be ctdb commit d645c8b0480e7d2765614a226d78510e100016de)
2007-06-06 11:27:06 +10:00
Ronnie Sahlberg
317dec2f9e merge from tridge
(This used to be ctdb commit 5f1f889e0e124c5275463795c004ae971945e1ae)
2007-06-05 18:16:45 +10:00
Ronnie Sahlberg
96a12cc4ab add a simple events script to manage vsftpd
(This used to be ctdb commit 413efc7af529e4ebda6f7ea6e36f79ba72a2d1d9)
2007-06-05 18:14:01 +10:00
Andrew Tridgell
ac55bc4166 first step in health monitoring of cluster nodes. When not healthy they will be marked disabled
(This used to be ctdb commit d3dbd9fc4db21632075b56fc52cf95435c63374a)
2007-06-05 17:43:19 +10:00
Andrew Tridgell
ee747b5bd6 set close on exec on pipe in event scripts, so long running scripts don't hold the pipe
(This used to be ctdb commit 22662614b4091a4e4282e63d6876097cbf3e3d6e)
2007-06-05 15:18:37 +10:00
Ronnie Sahlberg
32d19d3791 dont use CTDB_MANAGES_NFS for controlling the lockmanager
use a dedicated variable CTDB_MANAGES_NFSLOCK, since some might want to 
use nfs but no lockmanager

(This used to be ctdb commit 1e8cec86617ffb188bd49c70f074a4b350d3fe3d)
2007-06-05 12:43:35 +10:00
Andrew Tridgell
7498d3c55d explain event types
(This used to be ctdb commit 551472b78b025d9446ee58420dcec70c600555d0)
2007-06-04 23:54:46 +10:00
Andrew Tridgell
bd81cc521d ignore commented out entries in /etc/exports
(This used to be ctdb commit d316b49ba46e819359f045adfd87da92860fd1b5)
2007-06-04 23:54:22 +10:00
Andrew Tridgell
fcce534f23 allow setting of variables at startup in config file
(This used to be ctdb commit db39ca7c0ee1441113ac3279cb75b3cb38eecd1b)
2007-06-04 20:05:31 +10:00
Andrew Tridgell
dbb2ec43dd added tunables settable using ctdb command line tool
(This used to be ctdb commit 73d440f8cb19373cfad7a2f0f0ca4f963c57ff29)
2007-06-04 19:53:19 +10:00
Andrew Tridgell
62b30e478d make sure we don't have any namespace collision problems with config variables
(This used to be ctdb commit dde9024b25fe12cf25c059e5accb3ca21838b130)
2007-06-04 15:44:52 +10:00
Andrew Tridgell
cc9f6d30d8 split out the basic interface handling, and run event scripts in a deterministic order
(This used to be ctdb commit 399e993a4a233a5953e1e7264141e5c7c8c8c711)
2007-06-04 15:09:03 +10:00
Andrew Tridgell
73e626bc6b automatically bring up interfaces that we manage. This allows ctdb to work without requiring two IPs per public interface
(This used to be ctdb commit 221850dcf9c28698eb3a1baf33cbf7f9137ac502)
2007-06-04 14:16:51 +10:00
Andrew Tridgell
837fb236b9 handle NETWORKING var not existing
(This used to be ctdb commit f8cf9f81e8f81818dc141eda5419c2749a0652a4)
2007-06-03 22:11:48 +10:00
Andrew Tridgell
e763874872 make the init scripts more portable about location of system config files
(This used to be ctdb commit 65f3e2bc722e314b2c51c3bfdc544b408a8a64cf)
2007-06-03 22:07:07 +10:00
Ronnie Sahlberg
dac3f7d23c ubuntu uses a different style of init scripts than redhat and suse
(This used to be ctdb commit 6d3bee5d1a7dd6718045c673cfd150e3207ea970)
2007-06-03 19:24:52 +10:00
Andrew Tridgell
b9973e1d3e more portability tweaks in the init script
(This used to be ctdb commit 83a1c79e95af93a9ccfe78556ac5692c0315a3e4)
2007-06-03 17:53:26 +10:00
Andrew Tridgell
b4542aa00a don't start nfs services unless the relevant directories are available
(This used to be ctdb commit e0468d61119b6581f5ec458641568d03714a5786)
2007-06-03 14:39:27 +10:00
Andrew Tridgell
ee3ce951ce do a full restart in init cron call
(This used to be ctdb commit ed181dce8f307bd8f36de42351d04f39b2396836)
2007-06-03 10:29:57 +10:00
Andrew Tridgell
a795986baa docs on how to use statd-callout
(This used to be ctdb commit 4a75111b4f3f93dc42c9ced2d23f3cc933712017)
2007-06-02 19:45:06 +10:00
Andrew Tridgell
794d6dd59d move config files to config/ directory
(This used to be ctdb commit f95de519b885c8e1f40df0cda70fd796e479a22a)
2007-06-02 19:40:07 +10:00