mirror of https://github.com/samba-team/samba.git synced 2025-01-13 13:18:06 +03:00
545 Commits

Author SHA1 Message Date
Martin Schwenke
caee6f1508 Eventscripts: fix typo in _ctdb_counter_common().
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit f57d1722b6aa082f3f826171acc57d7d796ea95c)
2011-08-11 10:46:56 +10:00
Martin Schwenke
ab693dbcc0 Eventscripts: improve log messages in ctdb_start_stop_service().
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 6da7095192fb172a06b434cfb02f4bfa6221b343)
2011-08-11 10:46:56 +10:00
Martin Schwenke
1b956b2b0a Eventscript functions: fix counter regression.
d362be7d32079ac1390d67056ce107bfbca2c937 wasn't well thought out.
Subsequent commits depend on ctdb_counter_init() taking an argument,
so this makes those cases work.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 05a8fcfbac3da2b5843b31e0fe258255cc761190)
2011-08-11 10:46:56 +10:00
Martin Schwenke
217edfa1c8 Eventscript functions: ctdb_service_check_reconfigure() acts only on monitor.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit beabf506a5eb68fc50fdbf8772c1d2bb0f7951e3)
2011-08-11 10:46:56 +10:00
Martin Schwenke
cd4074d2f8 Eventscripts: make 50.samba use $service_state_dir.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 0f003f05e28037eefdce3a686fcb52cd2289af9d)
2011-08-11 10:46:56 +10:00
Martin Schwenke
3d1f0100be Eventscripts: update 60.nfs to use ctdb_service_check_reconfigure.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 7c070b0bc86b3b9a91a9dc263b72c0567934535c)
2011-08-11 10:46:56 +10:00
Martin Schwenke
a35138a001 Eventscripts: update 60.nfs to use ctdb_setup_service_state_dir.
The state directory basename becomes "nfs" rather than "statd".  One
line of code is moved from the "startup" event to service_start().

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit cc4c5c19af7efe01c48f73bb5ec5e607ed79db4c)
2011-08-11 10:46:20 +10:00
Martin Schwenke
d6c5fcfbae Eventscripts: update 40.vsftpd to use ctdb_service_check_reconfigure.
To simplify we also remove the reconfigure from the recovered event
because the monitor event will handle this very quickly anyway.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit da3aedd1a472b430b75989d3c157efedd382e327)
2011-08-11 10:46:20 +10:00
Martin Schwenke
4daf8bb1c8 Eventscripts: update 41.httpd to use ctdb_service_check_reconfigure.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 51c45b1c4751af41e5f9fd252763e0025f8cce3a)
2011-08-11 10:46:20 +10:00
Martin Schwenke
820d9b30ea Eventscripts: rejig the reconfigure infrastructure.
* Add an optional service name argument to existing reconfigure
  functions.

* Use function service_reconfigure() instead of variable
  $service_reconfigure to specify how a service is reconfigured.

* New function ctdb_service_check_reconfigure() reconfigures a service
  if it is flagged for reconfigure.

* Remove $service_reconfigure settings from 40.vsftpd and 41.httpd -
  they're the defaults.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 15d4111d0761d82f57d5d4f0b1227812d14e4d7c)
2011-08-11 10:46:20 +10:00
Martin Schwenke
5b5bd3d27b Eventscript functions: move flagging of managed services.
Move flagging of managed or unmanaged services into
ctdb_service_start() and ctdb_service_stop().  That way services will
be correctly flagged if they are started from the startup and shutdown
events.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 8675744cbd90b5a5095ed6fff7b36ae82004a457)
2011-08-11 10:46:20 +10:00
Martin Schwenke
428e32d647 Eventscript function: change service_start into a function.
service_start is currently a variable.  This makes passing arguments
hard.  We change it to be a function and put default definitions into
the functions file.

We use a convention that if a service name argument is passed to a
redefined version of service_start() or service_stop() then it will
act unconditionally.  If no argument is passed then it can use
internal logic to decide if services should really be started.  This
is useful when a single eventscript handles multiple services.

This is a cherry-pick of ae38895 that needed to be reset mid-stream.
There is still some breakage following this commit.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 86e4aefed9fd1028660c98e3ea758c2b75ffc1d8)
2011-08-11 10:46:20 +10:00
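
The service_start() convention described in 428e32d647 can be sketched roughly as follows. This is only an illustration under stated assumptions: the variables want_samba and want_winbind and the echoed messages are hypothetical, and the real eventscripts may structure this quite differently.

```shell
# A redefinition of service_start() for an eventscript that handles
# two services.  want_samba and want_winbind are hypothetical flags.
service_start ()
{
    if [ -n "${1:-}" ] ; then
        # A service name was passed: act unconditionally.
        echo "starting $1"
        return 0
    fi

    # No argument: use internal logic to decide what to start.
    [ "${want_samba:-no}" = "yes" ] && echo "starting samba"
    [ "${want_winbind:-no}" = "yes" ] && echo "starting winbind"
    return 0
}
```

Because service_start is a function rather than a variable, a caller can pass the service name as an ordinary argument, which is the difficulty the commit message mentions.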
Martin Schwenke
f60802c776 Eventscript functions: add optional event name argument to fail count functions.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit b14f18649f42aab80ce0336c15ab6159f241c9af)
2011-08-11 10:46:20 +10:00
Martin Schwenke
ea6a53e2b3 Eventscript functions - optimise is_ctdb_managed_service().
This function generates a lot of trace when running under "set -x".
This is due to the backward compatibility code.

This adds 3 optimisations:

1. Before invoking the backward compatibility code,
   is_ctdb_managed_service() returns early if the service is listed in
   $CTDB_MANAGED_SERVICES.

2. ctdb_compat_managed_service() actually now updates
   $CTDB_MANAGED_SERVICES instead of temporary variable $t.

   This means that a subsequent call to is_ctdb_managed_service() will
   short circuit due to optimisation (1).

3. ctdb_compat_managed_service() only adds a service to
   $CTDB_MANAGED_SERVICES if it is the service being checked by
   is_ctdb_managed_service().

   This stops irrelevant services being added to
   $CTDB_MANAGED_SERVICES multiple times by multiple calls to
   is_ctdb_managed_service().

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 758f4667c60089e09a0439c1eb74f5e426ca5e2e)
2011-08-11 10:46:20 +10:00
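
The three optimisations listed in ea6a53e2b3 might look something like the sketch below. The function names here (_service_listed, compat_managed_service, is_managed_service) are hypothetical stand-ins rather than the actual CTDB functions, and the "yes" convention for the legacy variables is an assumption.

```shell
CTDB_MANAGED_SERVICES="${CTDB_MANAGED_SERVICES:-}"

# Succeed if service $1 appears in the space-separated list.
_service_listed ()
{
    case " $CTDB_MANAGED_SERVICES " in
        *" $1 "*) return 0 ;;
    esac
    return 1
}

# Hypothetical compatibility helper.
# $1 = service name, $2 = legacy variable value, $3 = service being checked
compat_managed_service ()
{
    # Optimisation 3: only add the service that is being checked.
    if [ "$2" = "yes" ] && [ "$1" = "$3" ] ; then
        # Optimisation 2: update the real list, not a temporary.
        CTDB_MANAGED_SERVICES="$CTDB_MANAGED_SERVICES $1"
    fi
}

is_managed_service ()
{
    # Optimisation 1: return early, skipping the compatibility code.
    _service_listed "$1" && return 0

    compat_managed_service "samba" "${CTDB_MANAGES_SAMBA:-}" "$1"
    compat_managed_service "winbind" "${CTDB_MANAGES_WINBIND:-}" "$1"

    _service_listed "$1"
}
```

A second call for the same service hits the early return, so the compatibility code (and its "set -x" trace) runs at most once per service.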
Martin Schwenke
6ec2cfc7da 50.samba eventscript should use is_ctdb_managed_service "winbind".
Currently it checks $CTDB_MANAGES_WINBIND directly in several places.
This doesn't work when someone sets $CTDB_MANAGED_SERVICES directly.

This modifies check_ctdb_manages_winbind() so that it returns a
condition rather than modifying $CTDB_MANAGES_WINBIND.  This makes
some code more readable.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 538902fbc1e74134a03987b36b3733ad641f8971)
2011-08-11 10:46:20 +10:00
Martin Schwenke
e96e655430 50.samba eventscript should use is_ctdb_managed_service "samba".
Currently it checks $CTDB_MANAGES_SAMBA directly.  This doesn't work
when someone sets $CTDB_MANAGED_SERVICES directly.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d8f0f8948abd340088720718fef7dc858661ba23)
2011-08-11 10:46:20 +10:00
Martin Schwenke
45bcf843ec 50.samba eventscript should stop/start services when they become (un)managed.
When the value of $CTDB_MANAGES_SAMBA or $CTDB_MANAGES_WINBIND (or
corresponding changes are made to $CTDB_MANAGED_VERSIONS), the
associated service should be started or stopped as necessary.

This adds calls to ctdb_start_stop_service() to manage
starting/stopping samba and winbind.

An associated cleanup is made to the initial checks that one of
$CTDB_MANAGES_SAMBA or $CTDB_MANAGES_WINBIND is set, replacing them
with calls to is_ctdb_managed_service().

To handle the winbind cases ctdb_start_stop_service() and
is_ctdb_managed_service() are updated to take an optional service name
parameter.

Signed-off-by: Martin Schwenke <martin@meltin.net>

Conflicts:

	config/events.d/50.samba

	Most of this merged elsewhere.  This just removes a check that
	this is the monitor event.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 257a2e350280c0b76ed2fac588cad167381fda52)
2011-08-11 10:46:20 +10:00
Ronnie Sahlberg
21226ee738 Add documentation for the new filesystem use monitoring
(This used to be ctdb commit 9f10c5d48a08ffb3417f880c801aed2aa2dc1355)
2011-08-11 10:07:50 +10:00
Ronnie Sahlberg
ee96db07d5 Add new eventscript 40.fs_use that can be used to monitor file system use and flag a node unhealthy when they become full
(This used to be ctdb commit 2fd1babf8135ad5d53f3b25ba823d840ebc66460)
2011-08-11 10:04:40 +10:00
Ronnie Sahlberg
c8a18e8f9a make the persistent even longer for lvs to make people even happier
(This used to be ctdb commit 8158077624eb763ba40c6a7b4b7faf3867b205d7)
2011-08-11 09:12:38 +10:00
Ronnie Sahlberg
543701293f increase the persistent timeout to make people happier
(This used to be ctdb commit 68ea19cb02017e93769df7f6312d5e0bef55e605)
2011-08-11 07:14:57 +10:00
Ronnie Sahlberg
f9156adef5 check the shares if they are available before we decide to try to restart nfs
CQ S1027529

(This used to be ctdb commit b6c6a4588ccf6ef78fabfd76d228f56b4eb65165)
2011-08-11 07:14:16 +10:00
Martin Schwenke
4e60075228 Eventscripts - fix 10.interface bash incompatibility.
In dash, this fails gracefully with nothing to stderr:

  t=$(cat /does_not_exist) 2>/dev/null

In bash the error from cat is still printed due to different order of
evaluation.

This works everywhere:

  t=$(cat /does_not_exist 2>/dev/null)

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a6e61867c7a58d5a77cd8641d8df0b105cddff77)
2011-08-10 16:06:26 +10:00
Martin Schwenke
06f1004da4 Merge branch 'eventscript.20.multipathd' into eventscript.00.ctdb
(This used to be ctdb commit 8723b88b0b2bbeece38c74c77c50e8d8b3e2d5ca)
2011-08-10 15:32:58 +10:00
Martin Schwenke
383b203096 Merge branch 'eventscript.62.cnfs' into eventscript.20.multipathd
(This used to be ctdb commit fb87fa9273db4f82e801a331b5d95059d64dfb8e)
2011-08-10 15:32:11 +10:00
Martin Schwenke
7eae4aafca Merge branch 'eventscript.13.per_ip_routing' into eventscript.62.cnfs
(This used to be ctdb commit cfa4102ec0d97e1d1d3c1ce6407ffacdb85c2e10)
2011-08-10 15:31:13 +10:00
Martin Schwenke
098da255fa Eventscripts: update 61.cnfs to use ctdb_setup_service_state_dir.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit afafeb1fb12384bddff470d38b534f513a1f3b07)
2011-08-10 12:27:41 +10:00
Martin Schwenke
061b7adad6 Eventscripts: update 13.per_ip_routing to use ctdb_setup_service_state_dir.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 18e0236754507a9475653f04bb239c5d46ba51de)
2011-08-09 17:35:37 +10:00
Martin Schwenke
609a1e5c77 Eventscripts: update 20.multipathd to use ctdb_setup_service_state_dir.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 797ca65bdd59b14325ffd32b4d4140e9b01dbe71)
2011-08-09 17:28:09 +10:00
Martin Schwenke
f36bae1cbf Eventscripts: fix dangerous rm -rf in 00.ctdb init event.
Also remove some unnecessary absolute paths for commands, which were
making the code slightly difficult to read.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 1b3f2dd62efb240f8486016fe0f8dfb73d6ccc66)
2011-08-09 16:48:57 +10:00
Martin Schwenke
dd56cde3ff Eventscripts: 00.ctdb uses $service_state_dir, neaten update_config_from_tdb().
This also fixes a bug where update_config_from_tdb() used an incorrect
filename in one place.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a5ce2adaa39f077f56582072a97bb64d0eba4b4d)
2011-08-09 16:45:50 +10:00
Martin Schwenke
cbf030a72e 00.ctdb eventscript removes all files from $ctdb_active_dir.
Without this you can get into a situation where ctdbd can not start.
If the active file for a service exists but the service is not
running, then trying to stop the service may fail, causing the
eventscript to exit from ctdb_start_stop_service().

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 28379ca0f747c5952d690a451834ce7421adfd34)
2011-08-09 16:42:27 +10:00
Martin Schwenke
71e9016ec2 Scripts: add note about not using absolute command paths to README.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 87e6a4a23a6ae6c276e9628ce513663f47b4ee77)
2011-08-09 16:36:37 +10:00
Martin Schwenke
d81c1319e9 Add a README to the config/ subdirectory.
This includes a comment about using POSIX Bourne shell, including a
suggestion not to use "local" variables.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 5ae002c7513b1b2aa5136437a1a19f8cd179b869)
2011-08-09 16:36:37 +10:00
Martin Schwenke
ee38b9a159 Eventscript functions: new function ctdb_setup_service_state_dir().
To be used by eventscripts to create a per-service directory for their
own state data.  $service_state_dir is set to point to the new
directory.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a273554791c2a5281aee28f8e2be0c514e14c91e)
2011-08-09 16:35:07 +10:00
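
As a rough illustration of what ee38b9a159 describes, a per-service state directory helper could be written along these lines. The function name setup_service_state_dir, the "state" subdirectory and the /var/ctdb default are assumptions made for this sketch, not details taken from the commit.

```shell
# Assumed default; override $CTDB_VARDIR for testing.
CTDB_VARDIR="${CTDB_VARDIR:-/var/ctdb}"

# Create a per-service directory for state data and point
# $service_state_dir at it.
# $1 = service name
setup_service_state_dir ()
{
    # The "state" path component is hypothetical.
    service_state_dir="${CTDB_VARDIR}/state/$1"
    mkdir -p "$service_state_dir" || {
        echo "Error creating state dir \"$service_state_dir\""
        return 1
    }
}
```

An eventscript would call this once near the top and then keep its own files under $service_state_dir.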
Martin Schwenke
ec33c04283 Eventscript functions: new functions to remember/check if service managed.
This was done ad hoc and was badly named.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 9a084a121f629b2c1bcefc1e4c4a4a5cacf53987)
2011-08-09 16:20:08 +10:00
Martin Schwenke
50dc5b01a4 Scripts: remove absolute paths from interface_modify.sh.
The "ip" command is currently run as "/sbin/ip".  This makes it
impossible to replace with a stub in unit testing.  The functions file
controls $PATH, so we don't need absolute paths.

This replaces the absolute paths...

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 5b4c712aab3edc0059f2e5a6730b7fdcf7e5f4ec)
2011-08-08 15:50:10 +10:00
Martin Schwenke
eec654314a Eventscripts - Remove local variable usage in 10.interfaces.
POSIX sh doesn't have local variables.  Debian's dash doesn't behave
the same way as bash on this construct:

  local var=`command that produces multiple words`

It only assigns the 1st word and may print an error.

Just remove the use of the "local" keyword in monitor_interfaces() to
solve this.  It isn't actually limiting the scope of any variables
that are used outside the function.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 95d9a1e19655461288a2c7e52abf9d01ab23e05a)
2011-08-08 15:44:30 +10:00
Martin Schwenke
72362e7b56 Eventscripts: source a file specified by $CTDB_RC_LOCAL in functions file.
Another unit testing hook.  This is easier than dropping files into
rc.local.d/ and then removing them.

The file has to be executable.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit b13ac3bdaf326a6cdfd87da9195eb9630806c418)
2011-08-08 13:51:32 +10:00
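
The $CTDB_RC_LOCAL hook from 72362e7b56 could be sketched as below. Wrapping the check in a function named source_rc_local is an assumption made here so the behaviour can be exercised; the commit only says that the functions file sources the named file and that the file has to be executable.

```shell
# Source the file named by $CTDB_RC_LOCAL, but only if it is set
# and the file is executable.  The function name is hypothetical.
source_rc_local ()
{
    if [ -n "${CTDB_RC_LOCAL:-}" ] && [ -x "$CTDB_RC_LOCAL" ] ; then
        . "$CTDB_RC_LOCAL"
    fi
}
```

A unit test can then point $CTDB_RC_LOCAL at a file of stub definitions instead of dropping files into rc.local.d/ and removing them afterwards.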
Martin Schwenke
394bbe8454 Eventscript functions - use $CTDB_VARDIR instead of local $ctdb_spool_dir.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d0c6d9b19f0dd8946f9504b0d1cf50dd21f7a592)
2011-08-08 13:21:23 +10:00
Martin Schwenke
b0e7237653 Eventscripts - remove some more absolute paths to commands.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit f5b7cb03aaf19fb4b12fc3f0c14d98ee2d7b0798)
2011-08-04 17:14:11 +10:00
Martin Schwenke
8026b3ce5a Eventscripts - Rework the use of get_proc() for the bonding checks.
Call get_proc(), put the output into a variable and then use it.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 2dfdc997f432d522034922b43cb6f8f878d11ba7)
2011-08-03 20:12:48 +10:00
Martin Schwenke
6fd94af5cc Eventscripts: update 60.nfs service() start to use set_proc().
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 70ebb30b90956bb1212287d267ccb72ea83740ca)
2011-08-03 20:01:38 +10:00
Martin Schwenke
4b516600a2 Eventscripts: update 10.interface to use set_proc() and get_proc().
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 61b7f0172ba5c83c847c29fac3582c25c7754b68)
2011-08-03 19:58:25 +10:00
Martin Schwenke
cfdccc5cac Eventscripts: use set_proc() in startstop_nfs().
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 5a3d5c6b1ca3682bb45104e50061871dec6e9b1d)
2011-08-03 19:57:40 +10:00
Martin Schwenke
75bbc93c0b Eventscripts: remove unnecessary absolute paths from external commands.
For eventscript unit testing it will be necessary to override external
commands to allow stub implementations to be used.  If absolute paths
aren't used then this can be done using either a fake bin/
subdirectory or by using shell functions.

This removes all of the simple cases of absolute paths.

Signed-off-by: Martin Schwenke <martin@meltin.net>

Conflicts:

	config/ctdb.init
	config/events.d/50.samba

        Keep old code but remove absolute paths.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 05851d50b0078de8bf4691442d718825adca6fe8)
2011-08-03 17:19:15 +10:00
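
The shell-function approach to stubbing mentioned in 75bbc93c0b can be illustrated with a small sketch. The function show_links and the stub's output format are hypothetical; the point is only that a command invoked by name, rather than by absolute path, can be shadowed.

```shell
# Code under test: runs "ip" by name rather than as /sbin/ip.
show_links ()
{
    ip link show
}

# Test stub: a shell function with the same name takes precedence
# over the external command when the name contains no slash.
ip ()
{
    echo "stub ip called with: $*"
}
```

Had show_links() called /sbin/ip, the stub would never be reached. The same effect can be had without shell functions by putting a fake bin/ directory first in $PATH.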
Martin Schwenke
5f4ab05766 Eventscripts: new functions set_proc() and get_proc().
These provide a thin layer around writing and reading files in /proc.
They can be easily replaced by stubs for unit testing.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 637f9d8af517b73c72ed8f3cc2a2661f11eb2126)
2011-08-03 17:04:58 +10:00
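
A thin layer of the kind 5f4ab05766 describes might look like this. It is a minimal sketch: the PROCFS_ROOT override is an assumption added here so the functions can be tried without touching the real /proc, and the argument order is a guess rather than the confirmed CTDB interface.

```shell
# Hypothetical override; defaults to the real /proc.
PROCFS_ROOT="${PROCFS_ROOT:-/proc}"

# Write value $2 to the file $1 under the proc root.
set_proc ()
{
    echo "$2" >"${PROCFS_ROOT}/$1"
}

# Print the contents of the file $1 under the proc root.
get_proc ()
{
    cat "${PROCFS_ROOT}/$1"
}
```

Because every /proc access goes through these two functions, a test suite only needs to redefine them (or, in this sketch, repoint PROCFS_ROOT) to run eventscripts against fake data.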
Martin Schwenke
571e55ac0d Eventscripts: remove ctdb_wait_command() and ctdb_wait_tcp_ports() functions.
These haven't been used for a long time.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit f5fd361cadb3ea18d29e2d7215a7853718e48d00)
2011-08-03 17:02:41 +10:00
Martin Schwenke
e3a9991e46 Eventscripts: iptables() should put lock in $CTDB_VARDIR.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 3f04793f391c63b78ffb9c9851ab3f0daf3ed50a)
2011-08-03 16:55:43 +10:00
Martin Schwenke
3bbfdfcdd3 Make Emacs recognise that the eventscript functions file is a shell script.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a6dfb76cfa759f6f9409f24368111c4f85ca0fbf)
2011-08-03 16:49:38 +10:00
Martin Schwenke
3380c6ce1d Eventscript functions: add $CTDB_ETCDIR and hook service() functions.
* $CTDB_ETCDIR defaults to /etc but can be changed for testing.  All
  hard-coded instances of /etc have been changed to $CTDB_ETCDIR.
  This includes references to /etc/init.d and /etc/sysconfig.

* service() and nice_service() functions now call new function
  _service().  This makes it easier to override these functions (say,
  in rc.local) for testing and call most of the existing functionality
  using _service().

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit f43c9a7604b779bb6257ddb2bf3cbe266d496a63)
2011-08-03 16:45:54 +10:00
Martin Schwenke
d31fbcab4b Set $CTDB_VARDIR in the functions file.
This will be needed when eventscripts that use it are called
externally.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit ebd53b66b0cc66d9d04830781886234167fc2164)
2011-08-03 16:44:49 +10:00
Martin Schwenke
652bf326e1 Eventscripts - 10.interfaces should not check orphaned interfaces.
If the last IP address on an interface is removed then that
interface should no longer be checked by 10.interfaces.  However,
"ctdb ifaces" still lists such interfaces so they are currently
checked.

The problem really needs to be addressed in ctdbd but a neat quick
eventscript fix will be minimally invasive...

This changes the code to use "ctdb -Y ip -v" instead of "ctdb -Y
ifaces".  The former includes details of all public addresses and
associated interfaces, so when an address is removed there is no
output for it.  This prevents orphaned interfaces from being listed.

The logic is also slightly improved so that $IFACES includes just a
(non-uniquified) list of interfaces, allowing an existing loop to be
removed.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 49b2d1bd9554461ed8edbfc21e777c0eca9e1443)
2011-08-02 16:53:14 +10:00
Ronnie Sahlberg
18af72f08f change the name for the key for the record where we store the public address config from public-addresses... to public_addresses...
CQ1019030

(This used to be ctdb commit 114d5034ff4880848588caf493382a537a1469ae)
2011-06-28 15:40:46 +10:00
Mathieu Parent
c262fe6a8f Fix bashism
... again ;-)

Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit 2266586c1839af032622be54dc7f71e39d2bd9ef)
2011-05-14 22:30:25 +02:00
Ronnie Sahlberg
d020b2c950 When using multiple VLANs, some funky stuff can sometimes happen when
adding/removing IP addresses, causing routes to be dropped by the system.

The easiest workaround for this is to unconditionally try to reapply
all static routes for all interfaces once ipreallocation has finished,
not just adding them back on the affected interface.

This works around a funky issue in
CQ S1023538

(This used to be ctdb commit 84600d1f53632d5fe76c308727f31f61b5ec1010)
2011-05-12 12:06:45 +10:00
Ronnie Sahlberg
d1edf44e4f If samba fails to start for some reason, make this cause the startup event to fail too, so that ctdbd will re-try the startup event later.
Or else this will leave samba not running.

CQ S1023394

(This used to be ctdb commit f90485b08d32cbe56050718a3b28ca0fe1d64e0f)
2011-05-10 09:59:38 +10:00
Ronnie Sahlberg
ee9e137759 Dont exit from checking interfaces once we have found one interface that is not
in use by public addresses.   this can happen when we have removed existing interfaces/ip addresses and prevents us from verifying the status of other interfaces

(This used to be ctdb commit d67955b42f7627be9dae995230c8fcbb8a948ec2)
2011-05-10 07:53:43 +10:00
Ronnie Sahlberg
2e2e37fdd6 Remove logging of spam/errors from the 10.interface
script if/when we have for example NATGW configured but no public addresses defined on that interface

CQ S1023378

(This used to be ctdb commit 8837daa424732aeb5a20814b1709c345a97a0e09)
2011-05-09 08:10:49 +10:00
Ronnie Sahlberg
d97e42183e bonding mode 4 monitoring:
we can not just check if MII Status is up for bonding mode 4, since the kernel will always report the bond device as UP
even if all cables are disconnected.

For mode 4, ignore the status of the bond device and instead check if at least one slave interface is up
when determining if the device is good or bad

(This used to be ctdb commit a6930cec6d9503dba18b9d4839d87a1c1a8ddba2)
2011-04-13 09:05:58 +10:00
Ronnie Sahlberg
c04505724a IFACE handling. Assume links are always good on startup (they almost always
Simplify the handling of setting the links in the 10.interface eventscript
and remove the optimization to only call setifacelink on state change
to make the code simpler to read.

If a take ip event fails, flag the node as unhealthy.

Add a check to the interface script to check if the interface exists
or if it has been deleted, so that we can detect this and become
UNHEALTHY if someone deletes an interface we are using to host
public addresses.

(This used to be ctdb commit 4ab63d2a7262aff30d5eced184c294c9c9dd4974)
2011-04-11 07:40:05 +10:00
Ronnie Sahlberg
55853a4683 NATGW: dont set arp_ignore in 11.natgw anymore since we no longer
need this for the natgw functionality

(This used to be ctdb commit bf3bf2967e3781c918e33b3a210e68e0ccca0c51)
2011-04-06 11:33:11 +10:00
Michael Adam
c9dc10292e ctdb.init: print a warning when tdbdump is found but tdbtool or "tdbtool check" is not available
(This used to be ctdb commit afb26e38b617b85cdac14a7cd6dd3c85b8fddbc4)
2011-04-05 13:50:00 +02:00
Michael Adam
faa6d8d7e2 ctdb.init: check for availability of "tdbtool check" and "tdbdump"
Print a warning if neither is available.

(This used to be ctdb commit 4137d2a7d31cdce22847cebfc0239cfe2d8e937c)
2011-04-05 13:43:56 +02:00
Mathieu Parent
a5a6140b7e Correction of spelling errors
* continous -> continuous
* activete  -> activate

(thanks to lintian)

See https://bugzilla.samba.org/show_bug.cgi?id=6935

Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit fb6987c2f747d6dbf9bb3899a480124d1c242a90)
2011-03-23 00:35:23 +01:00
Ronnie Sahlberg
a453e79050 50.samba : Tell winbind about every time we add/remove an ip from the node
CQ S1021636

(This used to be ctdb commit 87b279027616cffbcedfd534ac0032cd51238dfe)
2011-02-18 11:29:35 +11:00
Ronnie Sahlberg
d32a4dd501 remove checking for filesystems and filesystem health from the cnfs script.
remove the gpfsmount and gpfsumount entry points

(This used to be ctdb commit 7db5a4832a9555be53c301f198f72b9e075a8ae7)
2011-02-18 10:11:56 +11:00
Ronnie Sahlberg
ef0ab7eee1 60.nfs
Dont update the statd settings that often.
When we have very many nodes and very many ips, this would generate
a lot of unnecessary load on the system

(This used to be ctdb commit 0c030c9384500f340d8382c20e1e91b11aa377e9)
2011-02-18 10:10:34 +11:00
Martin Schwenke
59c5a9f279 Eventscripts: lower the fail/restart limits for nfsd.
We were potentially leaving a node unable to serve requests for too
long.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 5be8610ffa33db49e33949560d0ef2fa5f3c0c73)
2011-01-11 16:49:46 +11:00
Martin Schwenke
96378d6dc8 Eventscripts: use "startstop_nfs restart" to reconfigure NFS.
This was defaulting to just "service nfs restart", which doesn't have
the workarounds we need.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 0f462e9e9fe12b595f3c7452123db8e69548abd6)
2011-01-11 16:49:14 +11:00
Martin Schwenke
3efd5ef77c Eventscripts: only autostart during a monitor event.
Otherwise we might short-circuit events that are run only once and
actually need to do something.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c4f9e8a43540bc049b2771e0a2d76d37b9d17331)
2011-01-11 16:48:50 +11:00
Martin Schwenke
fb8f199651 Eventscripts: print a message when reconfiguring a service.
Otherwise there can be strange error messages from services
stopping/starting, without any context.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 8bcf7ab164429ddc0ae530133e114f186a8146dd)
2011-01-11 16:48:17 +11:00
Martin Schwenke
934ae76d38 Eventscripts: work around NFS restart failure under load.
"service nfs restart" can fail.  To stop nfsd it sends a SIGINT and
nfsd might take a while to process it if the system is loaded.
Starting nfsd may then fail because resources are still in use.

This does some /proc magic to tell nfsd to do no more processing.  It
then runs service stop, kills nfsd with SIGKILL, and then runs service
start.  This is much less likely to fail.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a9bf4f82852975b0b627f61ceb2d23401f630805)
2011-01-11 16:47:43 +11:00
Ronnie Sahlberg
47aad74673 TYPO
(This used to be ctdb commit 38dc1ac2e87416a22c9356596286b773d601e71c)
2011-01-11 16:17:33 +11:00
Ronnie Sahlberg
2a3442d972 STATD is 100027 not 1000247
(This used to be ctdb commit f4cf15a2b06ffefde0cba803603b48040ad0fa05)
2011-01-11 16:16:28 +11:00
Ronnie Sahlberg
7e747aab8d 60.nfs Check if we have rpc.statd and if not, skip checking for statd
availability at all (since we cant restart it, there is no point checking
if it is alive)

(This used to be ctdb commit 6075e85ba6c0f58fd1ab2ce3b09dd3d6ff491365)
2011-01-06 15:49:15 +11:00
Ronnie Sahlberg
ded7c23122 41.HTTPD
Httpd can be very slow to start on some platforms,
wait 5 monitor intervals before we try to restart it if
it has not bound to port 80 yet.
After 10 failed intervals, flag the node as unhealthy.

(This used to be ctdb commit 6ec1993aa5f2778b8227ce5f6eca0d19e4ae9788)
2010-12-22 10:31:41 +11:00
Ronnie Sahlberg
e9ff38be7d 60.nfs
Try to restart LOCKD after 10 failures and
flag the node as unhealthy after 15 failures

(This used to be ctdb commit 5a67889c9166835aef3443051812d14af07dfca5)
2010-12-22 10:31:31 +11:00
Ronnie Sahlberg
57e74f6d8a Dont run net serverid wipe in the background
(This used to be ctdb commit 76c515f9f05f4fb5683b5ff65cf136c168fd882f)
2010-12-22 10:31:26 +11:00
Ronnie Sahlberg
97a6eccaf7 50.samba
Net serverid wipe can take a bit of time sometimes so background it.

Only perform auto start/stop of the managed service on the monitor event

(This used to be ctdb commit deba5cbbf7703a1a24ce88a06c73fca056e05521)
2010-12-14 21:19:28 +11:00
Ronnie Sahlberg
1e41ab5fa3 LVS
update lvs configuration on ipreallocated events too

(This used to be ctdb commit a4e98073d955676fdcbb91affae1de1a733d0bc2)
2010-12-13 14:24:16 +11:00
Ronnie Sahlberg
c26c6a01cf only run "serverid wipe" if we are actually running samba.
we dont need to run this on systems where we do run winbind but not samba

(This used to be ctdb commit fcb9e8d1e1c78439ea42adb8b05ad84fbca7f724)
2010-12-10 13:42:12 +11:00
Ronnie Sahlberg
8147d29598 add a missing part of the import of the previous ganesha patch
(This used to be ctdb commit 171b8855bb2feae7f7dd6a079571f3113dedd6f4)
2010-12-06 11:50:15 +11:00
Chandra Seetharaman
5e485d5ca0 make changes to ctdb event scripts to support NFS-Ganesha.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>

(This used to be ctdb commit 7298588ed54492f106954c893dd86b0a36783470)
2010-12-06 11:50:12 +11:00
Ronnie Sahlberg
8959c8e850 dont try starting samba through the "init" event
(This used to be ctdb commit e314a449606418a4c4eac6eb319bfcdf1c398cd3)
2010-12-03 11:40:38 +11:00
Ronnie Sahlberg
6ed0009125 When we are no longer the natgw master, dont put the natgw ip on loopback.
We put the ip on loopback just to make sure we would still interoperate with
non-standard configurations on unix-KDC, that are configured to verify the optional
HostAddresses field.
This is not required for AD, since AD does not use this field, and is replaced in
unix land with other/better mechanisms than this "dodgy" check.

This makes it "easier" for applications that have bound to the natgw address
to detect a socket problem and try to reconnect/recover if the ip address
is completely missing from the system.

At the same time, use the winbind specific hook that exists to explicitly tell winbindd : this address is gone, so if you have bound to it, this is a good time to close and rebind your socket.

cq 1020333

(This used to be ctdb commit 0da94869d2912b2a412ba3fbd2137d88ce4e4389)
2010-11-29 12:45:59 +11:00
Ronnie Sahlberg
ebcc866ae0 update autostart/stop to work for samba
(This used to be ctdb commit 37ab57e2adaecc3f7996ea20af45a5df0cd8be76)
2010-11-22 20:42:26 +11:00
Ronnie Sahlberg
a3e7dfadca add an explicit _is_managed_service to iscsi eventscript
(This used to be ctdb commit 44f683a1ba15944d3306a0effd572de3280ff975)
2010-11-18 14:15:56 +11:00
Ronnie Sahlberg
193d9d50d1 Dont pollute the logs with a "file not found" message
CQ S1020745

(This used to be ctdb commit ea8bb7b26bb879a895c267d49672433182390d0d)
2010-11-18 13:54:15 +11:00
Martin Schwenke
c00db6f271 60.nfs eventscript should do nothing if NFS isn't managed by CTDB.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 582e5cd077501e8d4131a9c7981781471308edfd)
2010-11-18 13:36:40 +11:00
Martin Schwenke
a2af87482b Eventscript functions - catch failures in ctdb_service_start().
ctdb_service_start() currently succeeds if ctdb_counter_init()
succeeds.

This changes it to fail when a service start fails.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit ddb73962d72d933bf0edc28be0dbb45bea7e5ef4)
2010-11-18 12:15:05 +11:00
Martin Schwenke
3ab768e8d4 50.samba eventscript should stop/start services when they become (un)managed.
When the value of $CTDB_MANAGES_SAMBA or $CTDB_MANAGES_WINBIND (or
corresponding changes are made to $CTDB_MANAGED_VERSIONS), the
associated service should be started or stopped as necessary.

This adds calls to ctdb_start_stop_service() to manage
starting/stopping samba and winbind.

An associated cleanup is made to the initial checks that one of
$CTDB_MANAGES_SAMBA or $CTDB_MANAGES_WINBIND is set, replacing them
with calls to is_ctdb_managed_service().

To handle the winbind cases ctdb_start_stop_service() and
is_ctdb_managed_service() are updated to take an optional service name
parameter.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d98f175e8420d921a123ae9c0ce00945350b1537)
2010-11-18 12:12:30 +11:00
Ronnie Sahlberg
4fe85e5be5 add a new support function ctdb_check_counter_equal()
update nfs to try to restart the service after 10 consecutive failures
and to flag the node unhealthy after 15

add similar function to mountd
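
A minimal sketch of the two thresholds, assuming an in-memory counter. The real eventscripts keep their fail counts elsewhere, and every name below (fail_count_sketch, counter_equal_sketch, counter_limit_sketch, monitor_failure_sketch, action_sketch) is hypothetical. It is also an assumption here that "after 15" means "at 15 or more".

```shell
fail_count_sketch=0

counter_incr_sketch ()  { fail_count_sketch=$((fail_count_sketch + 1)); }
# True only when the count is exactly the given value.
counter_equal_sketch () { [ "$fail_count_sketch" -eq "$1" ]; }
# True once the count has reached the given value.
counter_limit_sketch () { [ "$fail_count_sketch" -ge "$1" ]; }

# Called once per failed monitor check; records the action in action_sketch.
monitor_failure_sketch () {
    counter_incr_sketch
    action_sketch="none"
    if counter_equal_sketch 10 ; then
        action_sketch="restart"      # one restart attempt, at exactly 10
    fi
    if counter_limit_sketch 15 ; then
        action_sketch="unhealthy"    # from 15 onwards, fail the check
        return 1
    fi
    return 0
}
```

Using an equality test for the restart means the restart is attempted once rather than on every failure past the threshold.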

(This used to be ctdb commit 1569a54bb82fc433895ed68f816cf48399ad9d40)
2010-11-17 13:54:57 +11:00
Martin Schwenke
8fe1ec3754 Eventscripts: make loadconfig() function hookable by the test suite.
Rename loadconfig() to _loadconfig().  Add a new loadconfig() that
simply calls _loadconfig().

This makes it easy for the test suite to override loadconfig().
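
The wrapper pattern described above might look like this. The body of _loadconfig() here is a placeholder assumption (the real function loads CTDB configuration), and CONFIG_SOURCE_SKETCH is a hypothetical variable used only to show which implementation ran.

```shell
# Placeholder body: stands in for the real configuration loading.
_loadconfig () {
    CONFIG_SOURCE_SKETCH="real"
}

# Thin wrapper that scripts call; this is the hook point.
loadconfig () {
    _loadconfig "$@"
}
```

A test suite can then redefine loadconfig() after sourcing the functions, for example `loadconfig () { CONFIG_SOURCE_SKETCH="test"; }`, without touching _loadconfig().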

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 1d77a3adfff893b3c01b87f791e72c0d3148425c)
2010-11-17 11:46:48 +11:00
Martin Schwenke
e23ca7dba5 Make a time comparison in 60.nfs eventscript more readable.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 26077e6c8eb126584af587e7416154ea4858aea2)
2010-11-17 11:44:26 +11:00
Martin Schwenke
6ab5ae2c9b 60.nfs only fails or warns after 10 consecutive nfsd/statd failures.
These failures are sometimes the result of slow restarts so we want to
avoid dirtying the logs or marking a node unhealthy because of them,
unless they are excessive.

For these 2 cases we use the existing fail counting code but hack a
temporary service_name in a subshell to allow separate fail counts.
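
The subshell trick can be illustrated as below. This is a sketch under assumptions: counters are modelled as files in a temporary directory, and state_dir_sketch, counter_incr_sketch, counter_value_sketch and the names "nfs_knfsd" and "nfs_statd" are all hypothetical.

```shell
state_dir_sketch=$(mktemp -d)
service_name="nfs"

# Bump the counter for whatever $service_name currently is
# (one byte appended to a per-service file).
counter_incr_sketch () {
    printf 'x' >> "$state_dir_sketch/$service_name"
}

# Print the counter value for the named service (0 if never bumped).
counter_value_sketch () {
    _f="$state_dir_sketch/$1"
    if [ -f "$_f" ] ; then
        wc -c < "$_f" | tr -d ' '
    else
        echo 0
    fi
}

# The assignment is confined to the subshell, so each case gets its own
# counter and the outer service_name is left untouched.
( service_name="nfs_knfsd" ; counter_incr_sketch )
( service_name="nfs_statd" ; counter_incr_sketch ; counter_incr_sketch )
```

After this runs, service_name is still "nfs" in the calling shell and the two temporary names have independent counts.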

We also update ctdb_check_rpc() so that it captures the error output
from rpcinfo and we add a message including the service name to the
beginning.  The error is printed to stdout but is also stored in
ctdb_check_rpc_out to allow it to be conditionally used by the caller.
This function also now returns non-zero rather than exiting on
failure.
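
A rough sketch of the behaviour described for ctdb_check_rpc(): capture the probe's error output, prefix a message naming the service, print it, keep it in a variable, and return non-zero instead of exiting. The probe is replaced by a stub (probe_sketch) that always fails with simulated output, and all names and the message wording are assumptions rather than the actual implementation.

```shell
# Stub standing in for the rpcinfo call; always fails with simulated output.
probe_sketch () {
    echo "simulated probe error" >&2
    return 1
}

# Usage: check_rpc_sketch SERVICE_NAME [probe arguments...]
check_rpc_sketch () {
    _name="$1" ; shift
    if ! check_rpc_out=$(probe_sketch "$@" 2>&1) ; then
        # Put the service name first, then the captured error output.
        check_rpc_out="ERROR: $_name failed RPC check:
$check_rpc_out"
        echo "$check_rpc_out"
        return 1    # return, rather than exit, so the caller decides
    fi
}
```

Because the function returns instead of exiting, a caller can count the failure and decide later whether to print check_rpc_out.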

Other direct rpcinfo calls are replaced by calls to ctdb_check_rpc()
for consistency.

Option handling code for service restarts is cleaned up so that it fits
in 80 columns.  A more informative restart message is now used in all
cases, printing the exact command being used to start a service.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 79c25fe241cf5d8f92e23d3736823ebaf4e1769d)
2010-11-17 11:43:09 +11:00
Ronnie Sahlberg
055eafb790 this stuff is just so fragile that it will enter infinite recovery and fail loops
on any kind of tiny unexpected error

unconditionally try to remove ip addresses from both old and new interface
before trying to add it to the new interface to make it less
fragile

(This used to be ctdb commit 80acca2c91c9053c799365bae918db7ed8bdc56f)
2010-11-10 14:55:25 +11:00
Ronnie Sahlberg
ebed26d755 delete from old interface before adding to new interface
this stops the script from failing with an error if
both interfaces are specified as the same, which otherwise breaks and leads to an infinite recovery loop

(This used to be ctdb commit 565de03a784ed441490f8cd0b137b5cec8716d55)
2010-11-10 14:55:25 +11:00
Ronnie Sahlberg
76578b9533 don't delete all ips from the system during the initial "init" event
leave any ips as they are and let the recovery daemon remove them as required

(This used to be ctdb commit 8ab311719857847b4cf327507b0af1793551e73c)
2010-11-10 14:55:23 +11:00
Ronnie Sahlberg
a1cfa23d60 Both nfs and nfslock scripts can fail under redhat in very rare situations.
Ctdb can also be configured to skip checking whether knfsd is alive.
In that situation, no attempt will be made to restart nfs, and since nfs is not running, lockd cannot be restarted either.

To work around this, every time we try to restart the lockmanager, also try to restart nfsd

(This used to be ctdb commit 953dbfbddad656a64e30a6aca115cb1479d11573)
2010-10-28 13:45:40 +11:00