1
0
mirror of https://github.com/samba-team/samba.git synced 2025-01-26 10:04:02 +03:00

430 Commits

Author SHA1 Message Date
Ronnie Sahlberg
a453e79050 50.samba : Tell winbind about every time we add/remove and ip from the node
CQ S1021636

(This used to be ctdb commit 87b279027616cffbcedfd534ac0032cd51238dfe)
2011-02-18 11:29:35 +11:00
Ronnie Sahlberg
d32a4dd501 remove checking for filesystems and filesystem health from the cnfs script.
remove the gpfsmount and gpfsumount entry points

(This used to be ctdb commit 7db5a4832a9555be53c301f198f72b9e075a8ae7)
2011-02-18 10:11:56 +11:00
Ronnie Sahlberg
ef0ab7eee1 60.nfs
Dont update the statd settings that often.
When we have very many nodes and very many ips, this would generate
a lot of unnessecary load on the system

(This used to be ctdb commit 0c030c9384500f340d8382c20e1e91b11aa377e9)
2011-02-18 10:10:34 +11:00
Martin Schwenke
59c5a9f279 Eventscripts: lower the fail/restart limits for nfsd.
We were potentially leaving a node unable to serve requests for too
long.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 5be8610ffa33db49e33949560d0ef2fa5f3c0c73)
2011-01-11 16:49:46 +11:00
Martin Schwenke
96378d6dc8 Eventscripts: use "startstop_nfs restart" to reconfigure NFS.
This was defaulting to just "service nfs restart", which doesn't have
the workarounds we need.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 0f462e9e9fe12b595f3c7452123db8e69548abd6)
2011-01-11 16:49:14 +11:00
Martin Schwenke
3efd5ef77c Eventscripts: only autostart during a monitor event.
Otherwise we might short-circuit events that are run only once and
actually need to do something.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c4f9e8a43540bc049b2771e0a2d76d37b9d17331)
2011-01-11 16:48:50 +11:00
Martin Schwenke
fb8f199651 Eventscripts: print a message when reconfiguring a service.
Otherwise there can be strange error messages from services
stopping/starting, without any context.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 8bcf7ab164429ddc0ae530133e114f186a8146dd)
2011-01-11 16:48:17 +11:00
Martin Schwenke
934ae76d38 Eventscripts: work around NFS restart failure under load.
"service nfs restart" can fail.  To stop nfsd it sends a SIGINT and
nfsd might take a while to process it if the system is loaded.
Starting nfsd may then fail because resources are still in use.

This does some /proc magic to tell nfsd to do no more processing.  It
then runs service stop, kills nfsd with SIGKILL, and then runs service
start.  This is much less likely to fail.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a9bf4f82852975b0b627f61ceb2d23401f630805)
2011-01-11 16:47:43 +11:00
Ronnie Sahlberg
47aad74673 TYPO
(This used to be ctdb commit 38dc1ac2e87416a22c9356596286b773d601e71c)
2011-01-11 16:17:33 +11:00
Ronnie Sahlberg
2a3442d972 STATD is 100027 not 1000247
(This used to be ctdb commit f4cf15a2b06ffefde0cba803603b48040ad0fa05)
2011-01-11 16:16:28 +11:00
Ronnie Sahlberg
7e747aab8d 60.nfs Check if we have rpc.statd and if not, skip checking for statd
availability at all (since we cant restart it, there is not point checking
if it is alive)

(This used to be ctdb commit 6075e85ba6c0f58fd1ab2ce3b09dd3d6ff491365)
2011-01-06 15:49:15 +11:00
Ronnie Sahlberg
ded7c23122 41.HTTPD
Httpd can be very slow to start on some platforms,
wait 5 monitor intervals before we try to restart it if
it has not bound to port 80 yet.
After 10 failed intervals, flag the node as unhealthy.

(This used to be ctdb commit 6ec1993aa5f2778b8227ce5f6eca0d19e4ae9788)
2010-12-22 10:31:41 +11:00
Ronnie Sahlberg
e9ff38be7d 60.nfs
Try to restart LOCKD after 10 failures and
flag the node as unhealthy after 15 failures

(This used to be ctdb commit 5a67889c9166835aef3443051812d14af07dfca5)
2010-12-22 10:31:31 +11:00
Ronnie Sahlberg
57e74f6d8a Dont run net serverid wipe in the background
(This used to be ctdb commit 76c515f9f05f4fb5683b5ff65cf136c168fd882f)
2010-12-22 10:31:26 +11:00
Ronnie Sahlberg
97a6eccaf7 50.samba
Net serverid wipe can take a bit of time sometimes so background it.

Only perform auto start/stop of the managed service on the monitor event

(This used to be ctdb commit deba5cbbf7703a1a24ce88a06c73fca056e05521)
2010-12-14 21:19:28 +11:00
Ronnie Sahlberg
1e41ab5fa3 LVS
update lvs configuration on ipreallocated events too

(This used to be ctdb commit a4e98073d955676fdcbb91affae1de1a733d0bc2)
2010-12-13 14:24:16 +11:00
Ronnie Sahlberg
c26c6a01cf only run "serverid wipe" if we are actually running samba.
we dont need to run this on systems where we do run winbind but not samba

(This used to be ctdb commit fcb9e8d1e1c78439ea42adb8b05ad84fbca7f724)
2010-12-10 13:42:12 +11:00
Ronnie Sahlberg
8147d29598 add a missing part of the import of the previous ganesha patch
(This used to be ctdb commit 171b8855bb2feae7f7dd6a079571f3113dedd6f4)
2010-12-06 11:50:15 +11:00
Chandra Seetharaman
5e485d5ca0 make changes to ctdb event scripts to support NFS-Ganesha.
make changes to ctdb event scripts to support NFS-Ganesha.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>

(This used to be ctdb commit 7298588ed54492f106954c893dd86b0a36783470)
2010-12-06 11:50:12 +11:00
Ronnie Sahlberg
8959c8e850 dont try starting samba through the "init" event
(This used to be ctdb commit e314a449606418a4c4eac6eb319bfcdf1c398cd3)
2010-12-03 11:40:38 +11:00
Ronnie Sahlberg
6ed0009125 When we are no longer the natgw master, dont put the natgw ip on loopback.
We put the ip on loopback just to make sure we would still interoperate with
non-standard configurations on unix-KDC, that are configured to verify the optional
HostAddresses field.
This is not required for AD, since AD does not use this field, and is replaced in
unix land with other/better mechanisms than this "dodgy" check.

This makes it "easier" for applications that have bound to the natgw address
to detect a socket problem and try to reconnect/recover if the ip address
is completely missing from the system.

At the same time, use the winbind specific hook that exists to explicitely tell winbindd : this address is gone, so if you have bound to it, this is a good time to close and rebind your socket.

cq 1020333

(This used to be ctdb commit 0da94869d2912b2a412ba3fbd2137d88ce4e4389)
2010-11-29 12:45:59 +11:00
Ronnie Sahlberg
ebcc866ae0 update autostart/stop to work for samba
(This used to be ctdb commit 37ab57e2adaecc3f7996ea20af45a5df0cd8be76)
2010-11-22 20:42:26 +11:00
Ronnie Sahlberg
a3e7dfadca add an explicit _is_managed_service to iscsi eventscript
(This used to be ctdb commit 44f683a1ba15944d3306a0effd572de3280ff975)
2010-11-18 14:15:56 +11:00
Ronnie Sahlberg
193d9d50d1 Dont pollute the logs with a "file not found" message
CQ S1020745

(This used to be ctdb commit ea8bb7b26bb879a895c267d49672433182390d0d)
2010-11-18 13:54:15 +11:00
Martin Schwenke
c00db6f271 60.nfs eventscript should do nothing if NFS isn't managed by CTDB.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 582e5cd077501e8d4131a9c7981781471308edfd)
2010-11-18 13:36:40 +11:00
Martin Schwenke
a2af87482b Eventscript functions - catch failures in ctdb_service_start().
ctdb_service_start() currently succeeds if ctdb_counter_init()
succeeds.

This changes it to fail when a service start fails.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit ddb73962d72d933bf0edc28be0dbb45bea7e5ef4)
2010-11-18 12:15:05 +11:00
Martin Schwenke
3ab768e8d4 50.samba eventscript should stop/start services when they become (un)managed.
When the value of $CTDB_MANAGES_SAMBA or $CTDB_MANAGES_WINBIND (or
corresponding changes are made to $CTDB_MANAGED_VERSIONS), the
associated service should be started or stopped as necessary.

This add calls to ctdb_start_stop_service() to manage
starting/stopping samba and winbind.

An associated cleanup is made to the initial checks that one of
$CTDB_MANAGES_SAMBA or $CTDB_MANAGES_WINBIND is set, replacing them
with calls to is_ctdb_managed_service().

To handle the winbind cases ctdb_start_stop_service() and
is_ctdb_managed_service() are updated to take an optional service name
parameter.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d98f175e8420d921a123ae9c0ce00945350b1537)
2010-11-18 12:12:30 +11:00
Ronnie Sahlberg
4fe85e5be5 add a new support function ctdb_check_counter_equal()
update nfs to try to restart the service after 10 consecutive failures
and to flag the node unhealthy after 15

add similar function to mountd

(This used to be ctdb commit 1569a54bb82fc433895ed68f816cf48399ad9d40)
2010-11-17 13:54:57 +11:00
Martin Schwenke
8fe1ec3754 Eventscripts: make loadconfig() function hookable by the test suite.
Rename loadconfig() to _loadconfig().  Add a new loadconfig() that
simply calls _loadconfig().

This makes it easy for the test suite to override loadconfig().

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 1d77a3adfff893b3c01b87f791e72c0d3148425c)
2010-11-17 11:46:48 +11:00
Martin Schwenke
e23ca7dba5 Make a time comparison in 60.nfs eventscript more readable.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 26077e6c8eb126584af587e7416154ea4858aea2)
2010-11-17 11:44:26 +11:00
Martin Schwenke
6ab5ae2c9b 60.nfs only fails or warns after 10 consecutive nfsd/statd failures.
These failures are sometimes the result of slow restarts so we want to
avoid dirtying the logs or marking a node unhealthy because of them,
unless they are excessive.

For these 2 cases we use the existing fail counting code but hack a
temporary service_name in a subshell to allow separate fail counts.

We also update ctdb_check_rpc() so that it captures the error output
from rpcinfo and we add a message including the service name to the
beginning.  The error is printed to stdout but is also stored in
ctdb_check_rpc_out to allow it to be conditionally used by the caller.
This function also now returns non-zero rather than exiting on
failure.

Other direct rpcinfo calls are relaced by called to ctdb_check_rpc()
for consistency.

Option handling code for service restarts is cleaned up so that fits
in 80 columns.  A more informative restart messageis now used in all
cases, printing the exact command being used to start a service.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 79c25fe241cf5d8f92e23d3736823ebaf4e1769d)
2010-11-17 11:43:09 +11:00
Ronnie Sahlberg
055eafb790 this stuff is just so fragile that it will enter infinite recovery and fail loops
on any kind of tiny unexpected error

unconditionally try to remove ip addresses from both old and new interface
before trying to add it to the new interface to make it less
fragile

(This used to be ctdb commit 80acca2c91c9053c799365bae918db7ed8bdc56f)
2010-11-10 14:55:25 +11:00
Ronnie Sahlberg
ebed26d755 delete from old interface before adding to new interface
this stops the script from failing with an error if
both interfaces are specified as the same, which otherwise breaks and leads to an infinite recovery loop

(This used to be ctdb commit 565de03a784ed441490f8cd0b137b5cec8716d55)
2010-11-10 14:55:25 +11:00
Ronnie Sahlberg
76578b9533 dont delete all ips from the system during the initial "init" event
leave any ips as they are and let the recovery daemon remove them as required

(This used to be ctdb commit 8ab311719857847b4cf327507b0af1793551e73c)
2010-11-10 14:55:23 +11:00
Ronnie Sahlberg
a1cfa23d60 Both nfs and nfslock scripts can fail under redhat in very rare situations.
Ctdb can also be configured to ignore checking for knfsd and if it is alive.
In that situation, no attempt will be made to restart nfs, and sicne nfs is not running,  lockd can not be restarted either.

To workaround this, everytime we try to restart the lockmanager, also try to restart nfsd

(This used to be ctdb commit 953dbfbddad656a64e30a6aca115cb1479d11573)
2010-10-28 13:45:40 +11:00
Ronnie Sahlberg
0d75856bb7 When shuttind down, we always unconditionally try to remove the natgw address
even if we are not currently the natgw master.
This adds extra reliability in case we have stopped previously without removing it proper,
but does add spam messages to syslog everytime we shutdowm.

Remove these spam messages from pulluting the syslog upon normal shutdown

(This used to be ctdb commit cd84da6f247ee46bbab8318298d1cd3cfc87aba9)
2010-10-28 13:38:07 +11:00
Ronnie Sahlberg
14c8228292 Redirect the output from 00.ctdb pfetch to stdout.
Normally, the config.tdb database would not exist, so we do not need
to spam syslog with a "config.tdb does not exist" message every time we start ctdb

(This used to be ctdb commit 5792809b72e534161c5ca9ef5c9897abcb3b899c)
2010-10-28 13:35:55 +11:00
Stefan Metzmacher
ab6beb6b7f events.d/11.routing: handle "updateip" event
metze

(This used to be ctdb commit 034635418c7e5274d6bdf4cccc7a10e3b631e2d4)
2010-10-21 11:09:46 +11:00
Ronnie Sahlberg
b4e3a95039 try to restart NFS LOCKD if it failed to start
(This used to be ctdb commit 2913cc93a9a172caf9e0d6675cfa4de4cc957b13)
2010-10-14 08:13:09 +11:00
Ronnie Sahlberg
0de79c12ba Make sure the statd directory exist before trying to access the
"update trigger" file.

CQ 1020344

(This used to be ctdb commit 171f98f6f7ce7d01f47c44043ad599702711b12d)
2010-10-12 08:02:18 +11:00
Ronnie Sahlberg
842d9aab4e move extracting the config from config.tdb for public addresses
into its own function

(This used to be ctdb commit 2d478a39ed8303b0371112d61630660d12b7db2c)
2010-10-12 02:57:53 +11:00
Ronnie Sahlberg
f7febd28af dont stop checking interfaces after the first bond device
continue the loop to process all other interfaces too

(This used to be ctdb commit 500ade4e6a58ea786a665f6be7cf30f43c882570)
2010-10-09 10:55:43 +11:00
Ronnie Sahlberg
51a38dc4a4 Spotted by rusty.
Add a missing $
so we delete $_ip   and not _ip

(This used to be ctdb commit e9d04c5f419eaa0338a3beefba32c52be00242a8)
2010-10-08 15:53:36 +11:00
Ronnie Sahlberg
f5c0539dc6 Change how NATGW is configured to allow special nodes that do not have
network connectivity outside of the cluster to still be able to
participate in a natgw group.
These nodes can not become natgw master since they lack external network
connectivity.

These nodes are configured just the same way as for any other node with
NATGW, with the following two exceptions :
* we do NOT set CTDB_NATGW_PUBLIC_IFACE at all on these nodes.
  since these ndoes lack external network we should not check the interface
  for link.
* we must set CTDB_NATGW_SLAVE_ONLY=yes to flag that this is a node that
  can not become natgw master.

(This used to be ctdb commit ab7b00a37e55beffc074be95b55d8a5c7cb9eef2)
2010-09-08 09:20:16 +10:00
Ronnie Sahlberg
dc2f87737d Dont store temporary runtime data in $CTDB_BASE/state
since that will usually be /etc/ctdb/state and storing this under /etc is just
wrong.

Add a new variable CTDB_VARDIR that defaults to /var/ctdb and store the data there instead.

(This used to be ctdb commit 516423c25afa9861d9988096efa8a4a2b12b31b1)
2010-09-03 12:43:28 +10:00
Ronnie Sahlberg
c7df27e32d make sure all statd state directories exist before we try to reference them
or else tar and friends will throw an error in the log

(This used to be ctdb commit 96cbd2c0aa9a4641a42b3c33374675fa732ed1e5)
2010-09-01 15:49:57 +10:00
Ronnie Sahlberg
8be5bf1567 dont print a lot of log information about shutting down vsftpd
(This used to be ctdb commit 1a41cd7332703629001201eea8ae9b94f1341c9d)
2010-09-01 13:29:38 +10:00
Ronnie Sahlberg
9ef21f1c07 ouch, remove a dummy debug printout that snuck in there somehow
(This used to be ctdb commit 14c4d99513b4bdb94f60c3e9c4823e04b0833e60)
2010-08-30 19:48:41 +10:00
Ronnie Sahlberg
2b4d9170c2 Merge commit 'martins/master'
(This used to be ctdb commit cc8c851e2e0b46f00b18a6dc61fd2774e97850dd)
2010-08-30 18:22:05 +10:00
Ronnie Sahlberg
12cc826231 Remove the dependency on the underlying cluster filesystem for handling
the clusterwide persistent data associated with the lock manager and
statd notifications.

Use persistent databases to store this data instead of a shared directory.

(This used to be ctdb commit fc0678d351187cfa4c71123f97c0f493aacd5d16)
2010-08-30 18:14:41 +10:00