1
0
mirror of https://github.com/samba-team/samba.git synced 2025-01-27 14:04:05 +03:00

342 Commits

Author SHA1 Message Date
Martin Schwenke
1d71dd08e3 Eventscripts: change failure counts and behaviour for statd and nfsd.
We reduce the number of failures before attempting a restart.
However, after 6 failures we mark the cluster unhealthy and no longer
try to restart.  If the previous 2 attempts didn't work then there
isn't any use in bogging the system down with an attempted restart on
every monitor event.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit f654739080b40b7ac1b7f998cacc689d3d4e3193)
2011-08-12 14:16:17 +10:00
Martin Schwenke
398116ff29 Eventscripts: clean up 60.nfs monitor event.
This adds a helper function called nfs_check_rpc_service() and uses it
to make the monitor event much more readable.  An example of usage is
as follows:

  nfs_check_rpc_service "mountd" \
    -ge 10 "verbose restart:b unhealthy" \
    -eq 5 "restart:b"

The first argument to nfs_check_rpc_service() is the name of the RPC
service to be checked.  The RPC service corresponding to this command
is checked for availability using the rpcinfo command.  If the service
is available then the function succeeds and subsequent arguments are
ignored.

If the rpcinfo check fails then a failure counter for that particular
RPC service is incremented and subsequent arguments are processed in
groups of 3:

1. An integer comparison operator supported by test.
2. An integer failure limit.
3. An action string.

The value of the failure counter is checked using (1) and (2) above.
The first check that succeeds has its action string processed - note
that this explains the somewhat curious reverse ordering of checks.

It the example above:

* If the counter is >= 10 then a verbose message is printed
  describing the failure, the service is restarted in the background
  and the node is marked as unhealthy (via an "exit 1" from the
  function).

* If the counter is == 5 then the service us restarted in the
  background.

For more action options please see the code.

This also changes the ctdb_check_rpc() function so that it no longer
takes a program number to check.  It now just takes a real RPC program
name that rpcinfo can resolve via /etc/rpc.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 9b66057964756a6245bafb436eb6106fb6a2866e)
2011-08-12 14:16:14 +10:00
Ronnie Sahlberg
f9e58b502f Merge remote branch 'martins/eventscript.10.interface'
(This used to be ctdb commit 84ac667af408816e5508719b9fdb7c5e25408640)
2011-08-11 14:15:22 +10:00
Martin Schwenke
088620b026 Eventscripts: in 60.nfs move statd-notify code to service_reconfigure().
This means that it now occurs on every reconfigure event.  As a result
the ipreallocated event is removed.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c45a89418ba733ff91d48340d72bdb6d2ef80051)
2011-08-11 13:56:25 +10:00
Martin Schwenke
eef89f83b2 Eventscripts - 60.nfs should define service_reconfigure().
Not $service_reconfigure.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 642292d7ba7a95567964b4160c7ee31a4f8985d1)
2011-08-11 13:55:02 +10:00
Martin Schwenke
e66a1af9b3 Eventscripts: 50.samba - only start/stop nmbd if $CTDB_SERVICE_NMB set.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit defaec99df8c279d8e315d5010f9146e013afda2)
2011-08-11 10:46:57 +10:00
Martin Schwenke
8fb04d451e Eventscripts: 50.samba needs null service_reconfigure() function.
Samba doesn't need to do anything for configuration changes.  It will
notice configuration changes and reload automatically.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit de13350c17261032a7468c2cf4d2cf4a8d66a840)
2011-08-11 10:46:57 +10:00
Martin Schwenke
b01d99a8fa Eventscripts: 40.vsftpd service_stop() no longer /dev/null's output.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit f928c201b6d0e1cd3e5568ae65186e3cee7c4988)
2011-08-11 10:46:57 +10:00
Martin Schwenke
1ea3616dcc Eventscripts: improvements to 41.httpd.
* Reduce the failure counts so that restart attempts happen sooner.

* Use service_start() and service_stop() for the restart.
  ctdb_service_start() resets the failure count, which isn't very
  useful in this context.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 01776b9f29af9ad5c8534649ece1bd100e450434)
2011-08-11 10:46:56 +10:00
Martin Schwenke
cd4074d2f8 Eventscripts: make 50.samba use $service_state_dir.
Signed-off-by: Martin Schwenke <martin@meltin.net>

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 0f003f05e28037eefdce3a686fcb52cd2289af9d)
2011-08-11 10:46:56 +10:00
Martin Schwenke
3d1f0100be Evenscripts: update 60.nfs to use ctdb_service_check_reconfigure.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 7c070b0bc86b3b9a91a9dc263b72c0567934535c)
2011-08-11 10:46:56 +10:00
Martin Schwenke
a35138a001 Evenscripts: update 60.nfs to use ctdb_setup_service_state_dir.
The state directory basename becomes "nfs" rather than "statd".  One
line of code i moved from the "startup" event to service_start().

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit cc4c5c19af7efe01c48f73bb5ec5e607ed79db4c)
2011-08-11 10:46:20 +10:00
Martin Schwenke
d6c5fcfbae Evenscripts: update 40.vsftpd to use ctdb_service_check_reconfigure.
To simplify we also remove the reconfigure from the recovered event
because the monitor event will handle this very quickly anyway.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit da3aedd1a472b430b75989d3c157efedd382e327)
2011-08-11 10:46:20 +10:00
Martin Schwenke
4daf8bb1c8 Evenscripts: update 41.httpd to use ctdb_service_check_reconfigure.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 51c45b1c4751af41e5f9fd252763e0025f8cce3a)
2011-08-11 10:46:20 +10:00
Martin Schwenke
820d9b30ea Eventscripts: rejig the reconfigure infrastructure.
* Add an optional service name argument to existing reconfigure
  functions.

* User function service_reconfigure() instead of variable
  $service_reconfigure to specify how a service is reconfigured.

* New function ctdb_service_check_reconfigure() reconfigures a service
  if it is flagged for reconfigure.

* Remove $service_reconfigure settings from 40.vsftpd and 41.httpd -
  they're the defaults.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 15d4111d0761d82f57d5d4f0b1227812d14e4d7c)
2011-08-11 10:46:20 +10:00
Martin Schwenke
428e32d647 Eventscript function: change service_start into a function.
service_start is currently a variable.  This makes passing arguments
hard.  We change it to be a function and put default definitions into
the functions file.

We use a convention that if a service name argument is passed to a
redefined version of service_start() or service_stop() then it will
act unconditionally.  If no argument is passed then it can use
internal logic to decide if services should really be started.  This
is useful when a single eventscript handles multiple services.

This is a cherry-pick of ae38895 that needed to be reset mid-stream.
There is still some breakage following this commit.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 86e4aefed9fd1028660c98e3ea758c2b75ffc1d8)
2011-08-11 10:46:20 +10:00
Martin Schwenke
6ec2cfc7da 50.samba eventscript should use is_ctdb_managed_service "winbind".
Currently it checks $CTDB_MANAGES_WINBIND directly in several places.
This doesn't work when someone sets $CTDB_MANAGED_SERVICES directly.

This modifies check_ctdb_manages_winbind() so that it return a
condition rather than modifying $CTDB_MANAGES_WINBIND.  This makes
some code more readable.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 538902fbc1e74134a03987b36b3733ad641f8971)
2011-08-11 10:46:20 +10:00
Martin Schwenke
e96e655430 50.samba eventscript should use is_ctdb_managed_service "samba".
Currently it checks $CTDB_MANAGES_SAMBA directly.  This doesn't work
when someone sets $CTDB_MANAGED_SERVICES directly.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d8f0f8948abd340088720718fef7dc858661ba23)
2011-08-11 10:46:20 +10:00
Martin Schwenke
45bcf843ec 50.samba eventscript should stop/start services when they become (un)managed.
When the value of $CTDB_MANAGES_SAMBA or $CTDB_MANAGES_WINBIND (or
corresponding changes are made to $CTDB_MANAGED_VERSIONS), the
associated service should be started or stopped as necessary.

This add calls to ctdb_start_stop_service() to manage
starting/stopping samba and winbind.

An associated cleanup is made to the initial checks that one of
$CTDB_MANAGES_SAMBA or $CTDB_MANAGES_WINBIND is set, replacing them
with calls to is_ctdb_managed_service().

To handle the winbind cases ctdb_start_stop_service() and
is_ctdb_managed_service() are updated to take an optional service name
parameter.

Signed-off-by: Martin Schwenke <martin@meltin.net>

Conflicts:

	config/events.d/50.samba

	Most of this merged elsewhere.  This just removes a check that
	this is the monitor event.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 257a2e350280c0b76ed2fac588cad167381fda52)
2011-08-11 10:46:20 +10:00
Ronnie Sahlberg
ee96db07d5 Add new eventscript 40.fs_use that can be used to monitor file system use and flag a node unhealthy when they become full
(This used to be ctdb commit 2fd1babf8135ad5d53f3b25ba823d840ebc66460)
2011-08-11 10:04:40 +10:00
Ronnie Sahlberg
c8a18e8f9a make the persistent even longer for lvs to make people even happier
(This used to be ctdb commit 8158077624eb763ba40c6a7b4b7faf3867b205d7)
2011-08-11 09:12:38 +10:00
Ronnie Sahlberg
543701293f increase the persistent timeout to make people happier
(This used to be ctdb commit 68ea19cb02017e93769df7f6312d5e0bef55e605)
2011-08-11 07:14:57 +10:00
Ronnie Sahlberg
f9156adef5 check the shares if they are available before we decide to try to restart nfs
CQ S1027529

(This used to be ctdb commit b6c6a4588ccf6ef78fabfd76d228f56b4eb65165)
2011-08-11 07:14:16 +10:00
Martin Schwenke
4e60075228 Eventscripts - fix 10.interface bash incompatibility.
In dash, this fails gracefully with nothing to stderr:

  t=$(cat /does_not_exist) 2>/dev/null

In bash the error from cat is still printed due to different order of
evaluation.

This works everywhere:

  t=$(cat /does_not_exist 2>/dev/null)

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a6e61867c7a58d5a77cd8641d8df0b105cddff77)
2011-08-10 16:06:26 +10:00
Martin Schwenke
06f1004da4 Merge branch 'eventscript.20.multipathd' into eventscript.00.ctdb
(This used to be ctdb commit 8723b88b0b2bbeece38c74c77c50e8d8b3e2d5ca)
2011-08-10 15:32:58 +10:00
Martin Schwenke
383b203096 Merge branch 'eventscript.62.cnfs' into eventscript.20.multipathd
(This used to be ctdb commit fb87fa9273db4f82e801a331b5d95059d64dfb8e)
2011-08-10 15:32:11 +10:00
Martin Schwenke
7eae4aafca Merge branch 'eventscript.13.per_ip_routing' into eventscript.62.cnfs
(This used to be ctdb commit cfa4102ec0d97e1d1d3c1ce6407ffacdb85c2e10)
2011-08-10 15:31:13 +10:00
Martin Schwenke
098da255fa Evenscripts: update 61.cnfs to use ctdb_setup_service_state_dir.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit afafeb1fb12384bddff470d38b534f513a1f3b07)
2011-08-10 12:27:41 +10:00
Martin Schwenke
061b7adad6 Evenscripts: update 13.per_ip_routing to use ctdb_setup_service_state_dir.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 18e0236754507a9475653f04bb239c5d46ba51de)
2011-08-09 17:35:37 +10:00
Martin Schwenke
609a1e5c77 Evenscripts: update 20.multipathd to use ctdb_setup_service_state_dir.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 797ca65bdd59b14325ffd32b4d4140e9b01dbe71)
2011-08-09 17:28:09 +10:00
Martin Schwenke
f36bae1cbf Eventscripts: fix dangerous rm -rf in 00.ctdb init event.
Also remove some unnecessary absolute paths for commands, which were
making the code slightly difficult to read.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 1b3f2dd62efb240f8486016fe0f8dfb73d6ccc66)
2011-08-09 16:48:57 +10:00
Martin Schwenke
dd56cde3ff Eventscripts: 00.ctdb uses $service_state_dir, neaten update_config_from_tdb().
This also fixes a bug where update_config_from_tdb() used an incorrect
filename in one place.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a5ce2adaa39f077f56582072a97bb64d0eba4b4d)
2011-08-09 16:45:50 +10:00
Martin Schwenke
cbf030a72e 00.ctdb eventscript removes all files from $ctdb_active_dir.
Without this you can get into a situation where ctdbd can not start.
If the active file for a service exists but the service is not
running, then trying to stop the service may fail, causing the
eventscript to exit from ctdb_start_stop_service().

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 28379ca0f747c5952d690a451834ce7421adfd34)
2011-08-09 16:42:27 +10:00
Martin Schwenke
eec654314a Eventscripts - Remove local variable usage in 10.interfaces.
POSIX sh doesn't have local variables.  Debian's dash doesn't behave
the same way as bash on this contruct:

  local var=`command that produces multiple words`

It only assigns the 1st word and may print an error.

Just remove the use of the "local" keyword in monitor_interfaces() to
solve this.  It isn't actually limiting the scope of any variables
that are used outside the function.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 95d9a1e19655461288a2c7e52abf9d01ab23e05a)
2011-08-08 15:44:30 +10:00
Martin Schwenke
b0e7237653 Eventscripts - remove some more absolute paths to commands.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit f5b7cb03aaf19fb4b12fc3f0c14d98ee2d7b0798)
2011-08-04 17:14:11 +10:00
Martin Schwenke
8026b3ce5a Eventscripts - Rework the use of get_proc() for the bonding checks.
Call call_proc(), put the output into a variable and then use it.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 2dfdc997f432d522034922b43cb6f8f878d11ba7)
2011-08-03 20:12:48 +10:00
Martin Schwenke
6fd94af5cc Eventscripts: update 60.nfs service() start to use set_proc().
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 70ebb30b90956bb1212287d267ccb72ea83740ca)
2011-08-03 20:01:38 +10:00
Martin Schwenke
4b516600a2 Eventscripts: update 10.interface to use set_proc() and get_proc().
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 61b7f0172ba5c83c847c29fac3582c25c7754b68)
2011-08-03 19:58:25 +10:00
Martin Schwenke
75bbc93c0b Eventscripts: remove unnecessary absolute paths from external commands.
For eventscript unit testing it will be necessary to override external
commands to allow stub implementations to be used.  If absolute paths
aren't used then this can be done using either a fake bin/
subdirectory or by using shell functions.

This removes all of the simple cases of absolute paths.

Signed-off-by: Martin Schwenke <martin@meltin.net>

Conflicts:

	config/ctdb.init
	config/events.d/50.samba

        Keep old code but remove absolute paths.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 05851d50b0078de8bf4691442d718825adca6fe8)
2011-08-03 17:19:15 +10:00
Martin Schwenke
652bf326e1 Eventscripts - 10.interfaces should not check orphaned interfaces.
If the last IP address on an interfaces is removed then that
interfaces should no longer be checked by 10.interfaces.  However,
"ctdb ifaces" still lists such interfaces so they are currently
checked.

The problem really needs to be addressed in ctdbd but a neat quick
eventscript fix will be minimally invasive...

This changes the code to use "ctdb -Y ip -v" instead of "ctdb -Y
ifaces".  The former includes details of all public addresses and
associated interfaces, so when an address is removed there is no
output for it.  This avoids orphaned interfaces from being listed.

The logic is also slightly improved so that $IFACES includes just a
(non-uniquified) list of interfaces, allowing an existing loop to be
removed.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 49b2d1bd9554461ed8edbfc21e777c0eca9e1443)
2011-08-02 16:53:14 +10:00
Ronnie Sahlberg
18af72f08f change the name for the key for the record where we stoire the public address config from public-addresses... to public_addresses...
CQ1019030

(This used to be ctdb commit 114d5034ff4880848588caf493382a537a1469ae)
2011-06-28 15:40:46 +10:00
Mathieu Parent
c262fe6a8f Fix bashism
... again ;-)

Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit 2266586c1839af032622be54dc7f71e39d2bd9ef)
2011-05-14 22:30:25 +02:00
Ronnie Sahlberg
d020b2c950 When using multiple VLANs, some funky stuff can sometimes happen when
adding/removing IP addresses causing routes might be dropped by the system.

The easiest workaround for this is to unconditionally try to reapply
all static routes for all interfaces once ipreallocation has finished,
not just adding them back on the affected interface.

This worksaround a funky issue in
CQ S1023538

(This used to be ctdb commit 84600d1f53632d5fe76c308727f31f61b5ec1010)
2011-05-12 12:06:45 +10:00
Ronnie Sahlberg
d1edf44e4f If samba fails to start for some reason, make this cause the startup event to fail too, so that ctdbd will re-try the startup event later.
Or else this will leave samba not running.

CQ S1023394

(This used to be ctdb commit f90485b08d32cbe56050718a3b28ca0fe1d64e0f)
2011-05-10 09:59:38 +10:00
Ronnie Sahlberg
ee9e137759 Dont exit from checking interfaces once we have found one interface that is not
in use by public addresses.   this can happen when we have removed existing interfaces/ip addresses and prevents us from verifying the status of other interfaces

(This used to be ctdb commit d67955b42f7627be9dae995230c8fcbb8a948ec2)
2011-05-10 07:53:43 +10:00
Ronnie Sahlberg
2e2e37fdd6 Remove logging of spam/errors from the 10.interfrace
script if/when we have for example NATGW configured but no public addresses defined on that interface

CQ S1023378

(This used to be ctdb commit 8837daa424732aeb5a20814b1709c345a97a0e09)
2011-05-09 08:10:49 +10:00
Ronnie Sahlberg
d97e42183e bonding mode 4 monitoring:
we can not just check if MII Status is up for bonding mode 4, since the kernel will always report the bond device as UP
even if all cables are disconneccted.

For mode 4, ignore the status of the bond device and instead chek if at least one slave interface is up
when determining if the device is good or bad

(This used to be ctdb commit a6930cec6d9503dba18b9d4839d87a1c1a8ddba2)
2011-04-13 09:05:58 +10:00
Ronnie Sahlberg
c04505724a IFACE handling. Assume links are always good on nstartup (they almost always
Simplify the handling of setting the links in the 10.interface eventscript
and remove the optimization to only call setifacelink on state change
to make the code simpler to read.

If a take ip event fails, flag the node as unhealthy.

Add a check to the interface script to check if the interface exists
or if it has been deleted.
So that we can capture and become UNHELTHY if someone deletes an interface
we are using to host public addresses.

(This used to be ctdb commit 4ab63d2a7262aff30d5eced184c294c9c9dd4974)
2011-04-11 07:40:05 +10:00
Ronnie Sahlberg
55853a4683 NATGW: dont set arp_ignore in 11.natgw anymore since we no longer
need this for the natgw functionality

(This used to be ctdb commit bf3bf2967e3781c918e33b3a210e68e0ccca0c51)
2011-04-06 11:33:11 +10:00
Mathieu Parent
a5a6140b7e Correction of spelling errors
* continous -> continuous
* activete  -> activate

(thanks to lintian)

See https://bugzilla.samba.org/show_bug.cgi?id=6935

Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit fb6987c2f747d6dbf9bb3899a480124d1c242a90)
2011-03-23 00:35:23 +01:00