samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2024-12-23 17:34:34 +03:00

1463 lines

34 KiB

Plaintext


Make Emacs recognise that the eventscript functions file is a shell script. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a6dfb76cfa759f6f9409f24368111c4f85ca0fbf) 2010-12-02 01:48:02 +03:00			`# Hey Emacs, this is a -- shell-script -- !!!`

split out events for each subsystem separately (This used to be ctdb commit 03c629a72f234dcc783fa1085e7edba09597c241) 2007-06-01 14:54:26 +04:00			`# utility functions for ctdb event scripts`

Set $CTDB_VARDIR in the functions file. This will be needed when eventscripts that use it are called externally. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit ebd53b66b0cc66d9d04830781886234167fc2164) 2010-12-15 02:08:16 +03:00			`[ -z "$CTDB_VARDIR" ] && {`
config/functions: CTDB_VARDIR is /var/lib/ctdb on Debian-like systems (This used to be ctdb commit 56160eccb62178f645b017b1257677a1e854b2bc) 2011-10-13 22:26:05 +04:00			`if [ -d "/var/lib/ctdb" ] ; then`
			`export CTDB_VARDIR="/var/lib/ctdb"`
			`else`
			`export CTDB_VARDIR="/var/ctdb"`
			`fi`
Set $CTDB_VARDIR in the functions file. This will be needed when eventscripts that use it are called externally. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit ebd53b66b0cc66d9d04830781886234167fc2164) 2010-12-15 02:08:16 +03:00			`}`
Eventscript functions: add $CTDB_ETCDIR and hook service() functions. * $CTDB_ETCDIR defaults to /etc but can be changed for testing. All hard-coded instances of /etc have been changed to $CTDB_ETCDIR. This includes references to /etc/init.d and /etc/sysconfig. * service() and nice_service() functions now call new function _service(). This makes it easier to override these functions (say, in rc.local) for testing and call most of the existing functionality using _service(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f43c9a7604b779bb6257ddb2bf3cbe266d496a63) 2011-06-07 09:57:29 +04:00			`[ -z "$CTDB_ETCDIR" ] && {`
			`export CTDB_ETCDIR="/etc"`
			`}`
Set $CTDB_VARDIR in the functions file. This will be needed when eventscripts that use it are called externally. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit ebd53b66b0cc66d9d04830781886234167fc2164) 2010-12-15 02:08:16 +03:00
make the init scripts more portable about location of system config files (This used to be ctdb commit 65f3e2bc722e314b2c51c3bfdc544b408a8a64cf) 2007-06-03 16:07:07 +04:00			`#######################################`
			`# pull in a system config file, if any`
Eventscripts: make loadconfig() function hookable by the test suite. Rename loadconfig() to _loadconfig(). Add a new loadconfig() that simply calls _loadconfig(). This makes it easy for the test suite to override loadconfig(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 1d77a3adfff893b3c01b87f791e72c0d3148425c) 2010-08-31 11:40:40 +04:00			`_loadconfig() {`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00
More eventscript cleanups. Initial smoke testing seems OK. Apart from lots of cleanup work, this also fixes a bug where the share checks didn't used to cope with directory names containing spaces. The previous commit also loaded the config incorrectly. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3c93336ab92c2e4829ff4dc360045bfa6df21d50) 2009-11-20 08:45:36 +03:00			`if [ -z "$1" ] ; then`
More untested eventscript factorisation. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit ac655b0a65b32d809d47fec9821f7f31bb2fe2a7) 2009-11-19 07:00:17 +03:00			`foo="${service_config:-${service_name}}"`
			`if [ -n "$foo" ] ; then`
			`loadconfig "$foo"`
eventscripts: Fix regression in _loadconfig() fff88940f71058e4eefd65f50a6701389c005c17 introduced a regression. Without $service_name set by default, the CTDB configuration is no longer loaded when loadconfig() is called without any arguments. That's bad. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f1619a36c1beba11533052dc5728fa3adaa08870) 2013-05-14 08:56:26 +04:00			`return`
More untested eventscript factorisation. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit ac655b0a65b32d809d47fec9821f7f31bb2fe2a7) 2009-11-19 07:00:17 +03:00			`fi`
eventscripts: Fix regression in _loadconfig() fff88940f71058e4eefd65f50a6701389c005c17 introduced a regression. Without $service_name set by default, the CTDB configuration is no longer loaded when loadconfig() is called without any arguments. That's bad. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f1619a36c1beba11533052dc5728fa3adaa08870) 2013-05-14 08:56:26 +04:00			`fi`

			`if [ "$1" != "ctdb" ] ; then`
eventscripts: stop loadconfig function from loading ctdb config file twice. If "$1" was empty than loadconfig would load the ctdb config twice. This stops that from happening. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0406d406da70aaee7ad6aac236114905c5d03ed2) 2010-01-22 09:19:12 +03:00			`loadconfig "ctdb"`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`fi`

eventscripts: Fix regression in _loadconfig() fff88940f71058e4eefd65f50a6701389c005c17 introduced a regression. Without $service_name set by default, the CTDB configuration is no longer loaded when loadconfig() is called without any arguments. That's bad. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f1619a36c1beba11533052dc5728fa3adaa08870) 2013-05-14 08:56:26 +04:00			`if [ -z "$1" ] ; then`
			`return`
			`fi`

Eventscript functions: add $CTDB_ETCDIR and hook service() functions. * $CTDB_ETCDIR defaults to /etc but can be changed for testing. All hard-coded instances of /etc have been changed to $CTDB_ETCDIR. This includes references to /etc/init.d and /etc/sysconfig. * service() and nice_service() functions now call new function _service(). This makes it easier to override these functions (say, in rc.local) for testing and call most of the existing functionality using _service(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f43c9a7604b779bb6257ddb2bf3cbe266d496a63) 2011-06-07 09:57:29 +04:00			`if [ -f $CTDB_ETCDIR/sysconfig/$1 ]; then`
			`. $CTDB_ETCDIR/sysconfig/$1`
			`elif [ -f $CTDB_ETCDIR/default/$1 ]; then`
			`. $CTDB_ETCDIR/default/$1`
More eventscript cleanups. Initial smoke testing seems OK. Apart from lots of cleanup work, this also fixes a bug where the share checks didn't used to cope with directory names containing spaces. The previous commit also loaded the config incorrectly. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3c93336ab92c2e4829ff4dc360045bfa6df21d50) 2009-11-20 08:45:36 +03:00			`elif [ -f $CTDB_BASE/sysconfig/$1 ]; then`
			`. $CTDB_BASE/sysconfig/$1`
make the init scripts more portable about location of system config files (This used to be ctdb commit 65f3e2bc722e314b2c51c3bfdc544b408a8a64cf) 2007-06-03 16:07:07 +04:00			`fi`
scripts: Add support for optional ctdbd.conf configuration file Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 8f660d0dd52013e5876806be908e8e603aa6e968) 2013-09-23 10:22:36 +04:00
			`if [ "$1" = "ctdb" ] ; then`
			`_config="${CTDB_BASE}/ctdbd.conf"`
			`if [ -r "$_config" ] ; then`
			`. "$_config"`
			`fi`
			`fi`
make the init scripts more portable about location of system config files (This used to be ctdb commit 65f3e2bc722e314b2c51c3bfdc544b408a8a64cf) 2007-06-03 16:07:07 +04:00			`}`

Eventscripts: make loadconfig() function hookable by the test suite. Rename loadconfig() to _loadconfig(). Add a new loadconfig() that simply calls _loadconfig(). This makes it easy for the test suite to override loadconfig(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 1d77a3adfff893b3c01b87f791e72c0d3148425c) 2010-08-31 11:40:40 +04:00			`loadconfig () {`
			`_loadconfig "$@"`
			`}`
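The search order used by `_loadconfig()` can be tried out against a scratch directory. The following is a minimal sketch, not part of the functions file: `find_config` is a hypothetical helper that only prints the file that would be sourced, and the directory layout and `EXAMPLE_SETTING` are made up for illustration.

```shell
# Hypothetical helper: print the first existing config file for service "$1",
# using the same order as _loadconfig():
#   $CTDB_ETCDIR/sysconfig, $CTDB_ETCDIR/default, $CTDB_BASE/sysconfig
find_config ()
{
    for _d in "$CTDB_ETCDIR/sysconfig" \
              "$CTDB_ETCDIR/default" \
              "$CTDB_BASE/sysconfig" ; do
        if [ -f "$_d/$1" ] ; then
            echo "$_d/$1"
            return 0
        fi
    done
    return 1
}

# Scratch layout, assumed for illustration only
_tmp=$(mktemp -d)
CTDB_ETCDIR="$_tmp/etc"
CTDB_BASE="$_tmp/etc/ctdb"
mkdir -p "$CTDB_ETCDIR/default" "$CTDB_BASE/sysconfig"
echo 'EXAMPLE_SETTING=from-default' >"$CTDB_ETCDIR/default/ctdb"
echo 'EXAMPLE_SETTING=from-base' >"$CTDB_BASE/sysconfig/ctdb"

# Both files exist; the one under $CTDB_ETCDIR/default is found first
find_config ctdb
```

Because `CTDB_ETCDIR` is only given a default when unset, pointing it at a scratch tree like this is enough to redirect the lookup during testing.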

Eventscripts - new function ctdb_set_current_debuglevel() This function ensures that CTDB_CURRENT_DEBUGLEVEL is set. It works like this: 1. If it is already set then do nothing, since it might have been set some other way. The recommended "other way" would be to add a file in rc.local.d/. 2. If it is not set then set it by sourcing /var/ctdb/eventscript_debuglevel. 3. If this file does not exist then create it using output from "ctdb getdebug". If the optional 1st argument is set to "create" then don't source an existing file but create a new one instead - this is useful for creating the file just once in each event run in, say, 00.ctdb. If there's a problem getting the debug level from ctdb then it is silently set to 0 - no use spamming logs if our debug code is broken... Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 93910921c8a25f2b029733cd938069ff7c7bdab7) 2011-08-17 03:00:46 +04:00			`##############################################################`

scripts: Use $CTDB_SCRIPT_DEBUGLEVEL instead of something more complex The current logic is horrible and creates an unnecessary file. Let's make the script debug level independent of ctddb's debug level. * Have debug() use $CTDB_SCRIPT_DEBUGLEVEL directly * Remove ctdb_set_current_debuglevel() * Remove the "getdebug" command from ctdb stub in eventscript unit tests * Update relevant eventscript unit tests to use $CTDB_SCRIPT_DEBUGLEVEL Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 85efa446c7f5c5af1c3a960001aa777775ae562f) 2013-04-17 07:12:32 +04:00			`# CTDB_SCRIPT_DEBUGLEVEL can be overwritten by setting it in a`
			`# configuration file.`
Eventscripts: add a debug() function and call ctdb_set_current_debuglevel() The debug function passes its arguments to echo if $CTDB_CURRENT_DEBUGLEVEL is >= 4 (i.e. DEBUG). If no args are given then use stdin - this allows the function to be used with here documents. To ensure $CTDB_CURRENT_DEBUGLEVEL is set, ctdb_set_current_debuglevel() is called near the end of the functions file. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 6143483d9f87322578c00f12081e381f425226ca) 2011-08-17 03:44:11 +04:00			`debug ()`
			`{`
scripts: Use $CTDB_SCRIPT_DEBUGLEVEL instead of something more complex The current logic is horrible and creates an unnecessary file. Let's make the script debug level independent of ctddb's debug level. * Have debug() use $CTDB_SCRIPT_DEBUGLEVEL directly * Remove ctdb_set_current_debuglevel() * Remove the "getdebug" command from ctdb stub in eventscript unit tests * Update relevant eventscript unit tests to use $CTDB_SCRIPT_DEBUGLEVEL Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 85efa446c7f5c5af1c3a960001aa777775ae562f) 2013-04-17 07:12:32 +04:00			`if [ ${CTDB_SCRIPT_DEBUGLEVEL:-2} -ge 4 ] ; then`
Eventscripts: add a debug() function and call ctdb_set_current_debuglevel() The debug function passes its arguments to echo if $CTDB_CURRENT_DEBUGLEVEL is >= 4 (i.e. DEBUG). If no args are given then use stdin - this allows the function to be used with here documents. To ensure $CTDB_CURRENT_DEBUGLEVEL is set, ctdb_set_current_debuglevel() is called near the end of the functions file. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 6143483d9f87322578c00f12081e381f425226ca) 2011-08-17 03:44:11 +04:00			`# If there are arguments then echo them. Otherwise expect to`
			`# use stdin, which allows us to pass lots of debug using a`
			`# here document.`
			`if [ -n "$1" ] ; then`
			`echo "DEBUG: $*"`
			`elif ! tty -s ; then`
			`sed -e 's@^@DEBUG: @'`
			`fi`
			`fi`
			`}`
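As a usage sketch, `debug()` can be called with arguments or fed from a here document. The function body below is copied from the listing above so that the example runs on its own; the messages are made up.

```shell
debug ()
{
    if [ ${CTDB_SCRIPT_DEBUGLEVEL:-2} -ge 4 ] ; then
        if [ -n "$1" ] ; then
            echo "DEBUG: $*"
        elif ! tty -s ; then
            sed -e 's@^@DEBUG: @'
        fi
    fi
}

# Nothing is printed at the default level (2)
debug "this message is dropped"

# At level 4 or above, arguments are echoed with a prefix...
CTDB_SCRIPT_DEBUGLEVEL=4
debug "interface check starting"

# ...and with no arguments each line of stdin is prefixed instead
debug <<EOF
first line of detail
second line of detail
EOF
```

The `tty -s` test keeps a bare `debug` call from blocking on a terminal when nothing has been redirected into it.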

Eventscript functions - add new function die() Args: 1. Error message to be printed. 2. Option exit code (default 1) Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 97b0c138cb97e30db27c40b4ee1481109ae90c78) 2012-03-07 09:09:56 +04:00			`die ()`
			`{`
			`_msg="$1"`
			`_rc="${2:-1}"`

			`echo "$_msg"`
			`exit $_rc`
			`}`
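A short usage sketch for `die()`, with the function copied from above. Since it calls `exit`, the example runs it in a subshell; the message and exit code are made up.

```shell
die ()
{
    _msg="$1"
    _rc="${2:-1}"

    echo "$_msg"
    exit $_rc
}

# Run in a subshell so that exit does not terminate the calling shell
( die "example failure message" 3 ) || echo "exit status: $?"

# Without a second argument the exit code defaults to 1
( die "another failure" ) || echo "exit status: $?"
```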

scripts: Refactor logging code in initscript and functions file Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5ee242c949a98bb7397e0f7368b20d44c06fe772) 2012-10-16 10:04:48 +04:00			`# Log given message or stdin to either syslog or a CTDB log file`
			`# $1 is the tag passed to logger if syslog is in use.`
			`script_log ()`
eventscripts: Auto-start/stop services in background If $CTDB_SERVICE_AUTOSTARTSTOP="yes" then service start/stop is done in the background with logging. Fix some unit tests for samba and winbind. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3a3dae4cb5ec8b4b8381a4013adda25b87641f3a) 2012-09-03 09:37:01 +04:00			`{`
scripts: Refactor logging code in initscript and functions file Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5ee242c949a98bb7397e0f7368b20d44c06fe772) 2012-10-16 10:04:48 +04:00			`_tag="$1" ; shift`

ctdb-logging: New option CTDB_LOGGING, remove CTDB_LOGFILE, CTDB_SYSLOG Remove --logfile and --syslog daemon options and replace with --logging. Modularise and clean up logging initialisation code. The initialisation API includes an app_name argument that is currently unused - this will be used in extensions to the syslog backend. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-08-11 11:07:41 +04:00			`case "$CTDB_LOGGING" in`
			`file:*|"")`
			`if [ -n "$CTDB_LOGGING" ] ; then`
			`_file="${CTDB_LOGGING#file:}"`
scripts: Refactor logging code in initscript and functions file Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5ee242c949a98bb7397e0f7368b20d44c06fe772) 2012-10-16 10:04:48 +04:00			`else`
ctdb-logging: New option CTDB_LOGGING, remove CTDB_LOGFILE, CTDB_SYSLOG Remove --logfile and --syslog daemon options and replace with --logging. Modularise and clean up logging initialisation code. The initialisation API includes an app_name argument that is currently unused - this will be used in extensions to the syslog backend. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-08-11 11:07:41 +04:00			`_file="/var/log/log.ctdb"`
scripts: Refactor logging code in initscript and functions file Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5ee242c949a98bb7397e0f7368b20d44c06fe772) 2012-10-16 10:04:48 +04:00			`fi`
ctdb-logging: New option CTDB_LOGGING, remove CTDB_LOGFILE, CTDB_SYSLOG Remove --logfile and --syslog daemon options and replace with --logging. Modularise and clean up logging initialisation code. The initialisation API includes an app_name argument that is currently unused - this will be used in extensions to the syslog backend. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-08-11 11:07:41 +04:00			`{`
			`if [ -n "$*" ] ; then`
			`echo "$*"`
			`else`
			`cat`
			`fi`
			`} >>"$_file"`
			`;;`
			`*)`
ctdb-logging: Add logging via UDP to 127.0.0.1:514 to syslog backend This has most of the advantages of the old logd with none of the complexity of the extra process. There are several good syslog implementations that can listen on the UDP port. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-08-08 14:59:21 +04:00			`# Handle all syslog:* variants here too. There's no tool to do`
			`# the lossy things, so just use logger.`
ctdb-logging: New option CTDB_LOGGING, remove CTDB_LOGFILE, CTDB_SYSLOG Remove --logfile and --syslog daemon options and replace with --logging. Modularise and clean up logging initialisation code. The initialisation API includes an app_name argument that is currently unused - this will be used in extensions to the syslog backend. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-08-11 11:07:41 +04:00			`logger -t "ctdbd: ${_tag}" $*`
			`;;`
			`esac`
scripts: Refactor logging code in initscript and functions file Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5ee242c949a98bb7397e0f7368b20d44c06fe772) 2012-10-16 10:04:48 +04:00			`}`
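How `script_log()` picks a destination from `$CTDB_LOGGING` can be shown in isolation. This is a sketch only: `log_target` is a hypothetical helper that reports the destination rather than writing to it, using the same `case` patterns as above.

```shell
# Hypothetical helper: report where script_log() would send its output
log_target ()
{
    case "$CTDB_LOGGING" in
    file:*|"")
        if [ -n "$CTDB_LOGGING" ] ; then
            # Strip the "file:" prefix to get the log file path
            echo "${CTDB_LOGGING#file:}"
        else
            # Unset or empty: fall back to the default log file
            echo "/var/log/log.ctdb"
        fi
        ;;
    *)
        # Anything else is handed to logger, i.e. syslog
        echo "syslog"
        ;;
    esac
}

CTDB_LOGGING="file:/var/log/ctdb/example.log" ; log_target
CTDB_LOGGING="" ; log_target
CTDB_LOGGING="syslog" ; log_target
```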

			`# When things are run in the background in an eventscript then logging`
			`# output might get lost. This is the "solution". :-)`
			`background_with_logging ()`
			`{`
eventscripts: Auto-start/stop services in background If $CTDB_SERVICE_AUTOSTARTSTOP="yes" then service start/stop is done in the background with logging. Fix some unit tests for samba and winbind. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3a3dae4cb5ec8b4b8381a4013adda25b87641f3a) 2012-09-03 09:37:01 +04:00			`(`
			`"$@" 2>&1 </dev/null |`
scripts: Ensure even external scripts get tagged in logs as "ctdbd" Our practice is to search logs for "ctdbd:". We want to make sure we find everything. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5940a2494e9e43a83f2bca098bd04dfc1a8f2e93) 2013-04-22 07:48:06 +04:00			`script_log "${script_name}&"`
eventscripts: Auto-start/stop services in background If $CTDB_SERVICE_AUTOSTARTSTOP="yes" then service start/stop is done in the background with logging. Fix some unit tests for samba and winbind. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3a3dae4cb5ec8b4b8381a4013adda25b87641f3a) 2012-09-03 09:37:01 +04:00			`)&`

			`return 0`
			`}`

Eventscripts - new function ctdb_check_args() Pass this "$@" to do common eventscript argument checking. For regular use putting this in 00.ctdb would be enough. However, for developer testing it can be useful to call this in other eventscripts. For example, 10.interfaces and 13.per_ip_routing currently check these by hand. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 36de7e7fd6dfeed61ef9977b8d5b568f90a9707b) 2011-08-23 10:32:34 +04:00			`##############################################################`
			`# check number of args for different events`
			`ctdb_check_args ()`
			`{`
			`case "$1" in`
			`takeip|releaseip)`
			`if [ $# != 4 ]; then`
			`echo "ERROR: must supply interface, IP and maskbits"`
			`exit 1`
			`fi`
			`;;`
			`updateip)`
			`if [ $# != 5 ]; then`
			`echo "ERROR: must supply old interface, new interface, IP and maskbits"`
			`exit 1`
			`fi`
			`;;`
			`esac`
			`}`
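A usage sketch for `ctdb_check_args()`, with the function copied from above. The event name counts as the first argument, so `takeip` and `releaseip` expect 4 arguments in total and `updateip` expects 5. The interface names and addresses are made up.

```shell
ctdb_check_args ()
{
    case "$1" in
    takeip|releaseip)
        if [ $# != 4 ]; then
            echo "ERROR: must supply interface, IP and maskbits"
            exit 1
        fi
        ;;
    updateip)
        if [ $# != 5 ]; then
            echo "ERROR: must supply old interface, new interface, IP and maskbits"
            exit 1
        fi
        ;;
    esac
}

# Subshells keep the "exit 1" from ending the calling shell
( ctdb_check_args takeip eth0 10.0.0.1 24 ) && echo "takeip: ok"
( ctdb_check_args takeip eth0 ) || echo "takeip: rejected"

# Events without an entry in the case statement are not checked
( ctdb_check_args monitor ) && echo "monitor: ok"
```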

functions: add detect_init_style(). Michael (This used to be ctdb commit ab34a9480b59c649a4fc73a466c8ca0975453ed9) 2009-01-16 15:26:57 +03:00			`##############################################################`
			`# determine on what type of system (init style) we are running`
scripts: Make detect_init_style() more readable Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 516cdea0e73cf3f63b3303e22809834c8cbc64e4) 2013-10-18 06:24:03 +04:00			`detect_init_style()`
			`{`
functions: add detect_init_style(). Michael (This used to be ctdb commit ab34a9480b59c649a4fc73a466c8ca0975453ed9) 2009-01-16 15:26:57 +03:00			`# only do detection if not already set:`
scripts: Make detect_init_style() more readable Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 516cdea0e73cf3f63b3303e22809834c8cbc64e4) 2013-10-18 06:24:03 +04:00			`[ -z "$CTDB_INIT_STYLE" ] || return`
functions: add detect_init_style(). Michael (This used to be ctdb commit ab34a9480b59c649a4fc73a466c8ca0975453ed9) 2009-01-16 15:26:57 +03:00
			`if [ -x /sbin/startproc ]; then`
			`CTDB_INIT_STYLE="suse"`
			`elif [ -x /sbin/start-stop-daemon ]; then`
Revert "try to restart statd everytime it fails, not just the first time" This reverts commit 4f7b39a4871af28df1c4545ec37db179fa47a7da. (This used to be ctdb commit db7b96304e4725f29b12398b7582e385daed63ed) 2009-09-15 13:33:35 +04:00			`CTDB_INIT_STYLE="debian"`
functions: add detect_init_style(). Michael (This used to be ctdb commit ab34a9480b59c649a4fc73a466c8ca0975453ed9) 2009-01-16 15:26:57 +03:00			`else`
			`CTDB_INIT_STYLE="redhat"`
			`fi`
			`}`
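The following sketch shows that a preset `CTDB_INIT_STYLE` is left alone, with the function copied from above. One detail worth noting: the bare `return` passes on the status of the failed `-z` test, so the function returns non-zero when the style was already set. The value printed by the second call depends on the machine the example runs on.

```shell
detect_init_style()
{
    # only do detection if not already set:
    [ -z "$CTDB_INIT_STYLE" ] || return

    if [ -x /sbin/startproc ]; then
        CTDB_INIT_STYLE="suse"
    elif [ -x /sbin/start-stop-daemon ]; then
        CTDB_INIT_STYLE="debian"
    else
        CTDB_INIT_STYLE="redhat"
    fi
}

# A preset value wins; "|| true" absorbs the non-zero early return
CTDB_INIT_STYLE="debian"
detect_init_style || true
echo "preset:   $CTDB_INIT_STYLE"

# With nothing set, the style is probed from the helper binaries present
CTDB_INIT_STYLE=""
detect_init_style
echo "detected: $CTDB_INIT_STYLE"
```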
split out events for each subsystem separately (This used to be ctdb commit 03c629a72f234dcc783fa1085e7edba09597c241) 2007-06-01 14:54:26 +04:00
add an easy way to setup ctdb to start/stop samba (This used to be ctdb commit b0d9f427d83aff5b9a5c54b7b7c9d45d418e2352) 2007-06-02 12:51:05 +04:00			`######################################################`
			`# simulate /sbin/service on platforms that don't have it`
Eventscript functions: add $CTDB_ETCDIR and hook service() functions. * $CTDB_ETCDIR defaults to /etc but can be changed for testing. All hard-coded instances of /etc have been changed to $CTDB_ETCDIR. This includes references to /etc/init.d and /etc/sysconfig. * service() and nice_service() functions now call new function _service(). This makes it easier to override these functions (say, in rc.local) for testing and call most of the existing functionality using _service(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f43c9a7604b779bb6257ddb2bf3cbe266d496a63) 2011-06-07 09:57:29 +04:00			`# _service() makes it easier to hook the service() function for`
			`# testing.`
			`_service ()`
			`{`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`_service_name="$1"`
			`_op="$2"`
funcions: make (nice_)service a noop for empty service name Michael (This used to be ctdb commit 4cac2a16b70be772e4f1520020762f63c0bf3efe) 2009-01-16 15:31:02 +03:00
			`# do nothing, when no service was specified`
More eventscript cleanups. Initial smoke testing seems OK. Apart from lots of cleanup work, this also fixes a bug where the share checks didn't used to cope with directory names containing spaces. The previous commit also loaded the config incorrectly. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3c93336ab92c2e4829ff4dc360045bfa6df21d50) 2009-11-20 08:45:36 +03:00			`[ -z "$_service_name" ] && return`
funcions: make (nice_)service a noop for empty service name Michael (This used to be ctdb commit 4cac2a16b70be772e4f1520020762f63c0bf3efe) 2009-01-16 15:31:02 +03:00
add an easy way to setup ctdb to start/stop samba (This used to be ctdb commit b0d9f427d83aff5b9a5c54b7b7c9d45d418e2352) 2007-06-02 12:51:05 +04:00			`if [ -x /sbin/service ]; then`
Eventscript functions: add $CTDB_ETCDIR and hook service() functions. * $CTDB_ETCDIR defaults to /etc but can be changed for testing. All hard-coded instances of /etc have been changed to $CTDB_ETCDIR. This includes references to /etc/init.d and /etc/sysconfig. * service() and nice_service() functions now call new function _service(). This makes it easier to override these functions (say, in rc.local) for testing and call most of the existing functionality using _service(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f43c9a7604b779bb6257ddb2bf3cbe266d496a63) 2011-06-07 09:57:29 +04:00			`$_nice /sbin/service "$_service_name" "$_op"`
			`elif [ -x $CTDB_ETCDIR/init.d/$_service_name ]; then`
			`$_nice $CTDB_ETCDIR/init.d/$_service_name "$_op"`
			`elif [ -x $CTDB_ETCDIR/rc.d/init.d/$_service_name ]; then`
			`$_nice $CTDB_ETCDIR/rc.d/init.d/$_service_name "$_op"`
add an easy way to setup ctdb to start/stop samba (This used to be ctdb commit b0d9f427d83aff5b9a5c54b7b7c9d45d418e2352) 2007-06-02 12:51:05 +04:00			`fi`
			`}`

Eventscript functions: add $CTDB_ETCDIR and hook service() functions. * $CTDB_ETCDIR defaults to /etc but can be changed for testing. All hard-coded instances of /etc have been changed to $CTDB_ETCDIR. This includes references to /etc/init.d and /etc/sysconfig. * service() and nice_service() functions now call new function _service(). This makes it easier to override these functions (say, in rc.local) for testing and call most of the existing functionality using _service(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f43c9a7604b779bb6257ddb2bf3cbe266d496a63) 2011-06-07 09:57:29 +04:00			`service()`
			`{`
			`_nice=""`
			`_service "$@"`
			`}`

from Mathieu PARENT <math.parent@gmail.com> Simulate "nice service" on systems that do not have "service" (This used to be ctdb commit d0e6dcbadaf41745d423640e5ff5bafd9f68eb88) 2008-02-13 00:20:20 +03:00			`######################################################`
			`# simulate /sbin/service (niced) on platforms that don't have it`
Eventscript functions: add $CTDB_ETCDIR and hook service() functions. * $CTDB_ETCDIR defaults to /etc but can be changed for testing. All hard-coded instances of /etc have been changed to $CTDB_ETCDIR. This includes references to /etc/init.d and /etc/sysconfig. * service() and nice_service() functions now call new function _service(). This makes it easier to override these functions (say, in rc.local) for testing and call most of the existing functionality using _service(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f43c9a7604b779bb6257ddb2bf3cbe266d496a63) 2011-06-07 09:57:29 +04:00			`nice_service()`
			`{`
			`_nice="nice"`
			`_service "$@"`
from Mathieu PARENT <math.parent@gmail.com> Simulate "nice service" on systems that do not have "service" (This used to be ctdb commit d0e6dcbadaf41745d423640e5ff5bafd9f68eb88) 2008-02-13 00:20:20 +03:00			`}`
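Since `service()` and `nice_service()` are thin wrappers, a test can redefine them after sourcing the functions file. The following is a sketch of that idea, not code from the test suite: `SERVICE_LOG` and the recorded format are hypothetical.

```shell
# Hypothetical log file where the stubs record what was requested
SERVICE_LOG=$(mktemp)

# Stub replacements: record the request instead of touching real services
service ()
{
    echo "service $1 $2" >>"$SERVICE_LOG"
}

nice_service ()
{
    echo "nice service $1 $2" >>"$SERVICE_LOG"
}

# Code under test calls the usual names...
service smb start
nice_service nfs restart

# ...and the test inspects what was requested
cat "$SERVICE_LOG"
```

A stub that still wants the real behaviour for some services could call `_service "$@"` itself, which is the reason the shared body lives in a separate function.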
- wait for winbind on samba start - use $PATH for ctdb status (This used to be ctdb commit cf8d837cead1cbcb22c71ebbc3947970d1a565a3) 2007-06-17 05:57:42 +04:00
Eventscripts: new functions set_proc() and get_proc(). These provide a thin layer around writing and reading files in /proc. They can be easily replaced by stubs for unit testing. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 637f9d8af517b73c72ed8f3cc2a2661f11eb2126) 2011-06-28 08:54:33 +04:00			`######################################################`
			`# wrapper around /proc/ settings to allow them to be hooked`
			`# for testing`
			`# 1st arg is relative path under /proc/, 2nd arg is value to set`
			`set_proc ()`
			`{`
			`echo "$2" >"/proc/$1"`
			`}`

			`######################################################`
			`# wrapper around getting file contents from /proc/ to allow`
			`# this to be hooked for testing`
			`# 1st arg is relative path under /proc/`
			`get_proc ()`
			`{`
			`cat "/proc/$1"`
			`}`
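For unit testing, `set_proc()` and `get_proc()` can be replaced by stubs that read and write under a scratch directory instead of `/proc`. A minimal sketch, assuming a hypothetical `FAKE_PROC` variable:

```shell
# Hypothetical scratch directory standing in for /proc
FAKE_PROC=$(mktemp -d)

# Stub: write the value under $FAKE_PROC, creating parent directories
set_proc ()
{
    mkdir -p "$FAKE_PROC/$(dirname "$1")"
    echo "$2" >"$FAKE_PROC/$1"
}

# Stub: read the value back from $FAKE_PROC
get_proc ()
{
    cat "$FAKE_PROC/$1"
}

set_proc sys/net/ipv4/ip_forward 1
get_proc sys/net/ipv4/ip_forward
```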

ctdb-scripts: Factor out new function program_stack_traces() In the process, fix a bug where an extra trace would be printed. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-11-14 05:31:03 +03:00			`######################################################`
			`# Print up to $_max kernel stack traces for processes named $_program`
			`program_stack_traces ()`
			`{`
			`_prog="$1"`
			`_max="${2:-1}"`

			`_count=1`
			`for _pid in $(pidof "$_prog") ; do`
			`[ $_count -le $_max ] || break`

			`# Do this first to avoid racing with process exit`
			`_stack=$(get_proc "${_pid}/stack" 2>/dev/null)`
			`if [ -n "$_stack" ] ; then`
			`echo "Stack trace for ${_prog}[${_pid}]:"`
			`echo "$_stack"`
			`_count=$(($_count + 1))`
			`fi`
			`done`
			`}`

Eventscripts: clean up 60.nfs monitor event. This adds a helper function called nfs_check_rpc_service() and uses it to make the monitor event much more readable. An example of usage is as follows: nfs_check_rpc_service "mountd" \ -ge 10 "verbose restart:b unhealthy" \ -eq 5 "restart:b" The first argument to nfs_check_rpc_service() is the name of the RPC service to be checked. The RPC service corresponding to this command is checked for availability using the rpcinfo command. If the service is available then the function succeeds and subsequent arguments are ignored. If the rpcinfo check fails then a failure counter for that particular RPC service is incremented and subsequent arguments are processed in groups of 3: 1. An integer comparison operator supported by test. 2. An integer failure limit. 3. An action string. The value of the failure counter is checked using (1) and (2) above. The first check that succeeds has its action string processed - note that this explains the somewhat curious reverse ordering of checks. It the example above: * If the counter is >= 10 then a verbose message is printed describing the failure, the service is restarted in the background and the node is marked as unhealthy (via an "exit 1" from the function). * If the counter is == 5 then the service us restarted in the background. For more action options please see the code. This also changes the ctdb_check_rpc() function so that it no longer takes a program number to check. It now just takes a real RPC program name that rpcinfo can resolve via /etc/rpc. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9b66057964756a6245bafb436eb6106fb6a2866e) 2010-12-17 08:25:04 +03:00			`######################################################`
			`# Check that an RPC service is healthy -`
			`# this includes allowing a certain number of failures`
			`# before marking the NFS service unhealthy.`
			`#`
			`# usage: nfs_check_rpc_service SERVICE_NAME [ triple ...]`
			`#`
			`# each triple is a set of 3 arguments: an operator, a`
			`# fail count limit and an action string.`
			`#`
			`# For example:`
			`#`
			`# nfs_check_rpc_service "lockd" \`
			`# -ge 15 "verbose restart unhealthy" \`
			`# -eq 10 "restart:bs"`
			`#`
			`# says that if lockd is down for 15 iterations then do`
			`# a verbose restart of lockd and mark the node unhealthy.`
			`# Before this, after 10 iterations of failure, the`
			`# service is restarted silently in the background.`
			`# Order is important: the number of failures need to be`
			`# specified in reverse order because processing stops`
			`# after the first condition that is true.`
			`######################################################`
			`nfs_check_rpc_service ()`
			`{`
			`_prog_name="$1" ; shift`

eventscripts: Factor out common code from nfs_check_rpc_service() This creates new function _nfs_check_rpc_common(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit cc3bb42e48bbdabd19187c231846b98589b4f4f3) 2013-04-23 00:27:02 +04:00			`if _nfs_check_rpc_common "$_prog_name" ; then`
			`return`
			`fi`

			`while [ -n "$3" ] ; do`
eventscripts: nfs_check_rpc_action() should be _nfs_check_rpc_action() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5a717fd495ba5a2bfd481d69f38b68fa4576716f) 2013-04-23 00:28:27 +04:00			`if _nfs_check_rpc_action "$1" "$2" "$3" ; then`
eventscripts: Factor out common code from nfs_check_rpc_service() This creates new function _nfs_check_rpc_common(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit cc3bb42e48bbdabd19187c231846b98589b4f4f3) 2013-04-23 00:27:02 +04:00			`break`
			`fi`
			`shift 3`
			`done`
			`}`
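
The triple handling above can be tried in isolation. The following is a minimal sketch, not part of the functions file: `pick_action` is a hypothetical stand-in that takes the failure count as its first argument instead of reading a CTDB counter, and it only prints the selected action string rather than acting on it.

```shell
# Hypothetical helper (not in the functions file): print the action
# string of the first triple whose condition holds for COUNT.
# usage: pick_action COUNT [ OP LIMIT ACTIONS ] ...
pick_action ()
{
    _count="$1" ; shift

    # Same loop shape as nfs_check_rpc_service(): consume 3 arguments
    # at a time and stop at the first condition that is true.
    while [ -n "$3" ] ; do
        # $1 is an integer operator supported by test, $2 is the limit
        if [ "$_count" "$1" "$2" ] ; then
            echo "$3"
            return 0
        fi
        shift 3
    done

    return 1
}

# 10 failures: only the second triple matches
pick_action 10 -ge 15 "verbose restart unhealthy" -eq 10 "restart:b"

# 20 failures: the first triple matches, the second is never examined
pick_action 20 -ge 15 "verbose restart unhealthy" -eq 10 "restart:b"

# 3 failures: nothing matches
pick_action 3 -ge 15 "verbose restart unhealthy" -eq 10 "restart:b" ||
    echo "no action"
```

If the `-eq 10` triple were replaced by `-ge 10` and listed first, a count of 20 would select the milder action and the stricter limit would never be reached, which is why the highest limit goes first.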

eventscripts: New function nfs_check_rpc_services() This is intended to replace nfs_check_rpc_service(), which builds configuration into eventscripts. nfs_check_rpc_services() uses a directory of configuration checks that can be edited by an administrator. The files have one limit check and a set of actions per line. The program name is extracted from the file name. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9bc8fbee6550ed2814fb35c70d57fab21ef1b8fd) 2013-04-23 00:42:54 +04:00			`# The new way of doing things...`
			`nfs_check_rpc_services ()`
			`{`
			`# Files must end with .check - avoids editor backups, RPM fu, ...`
			`for _f in "${CTDB_BASE}/nfs-rpc-checks.d/"[0-9][0-9].*.check ; do`
			`_t="${_f%.check}"`
			`_prog_name="${_t##*/[0-9][0-9].}"`

			`if _nfs_check_rpc_common "$_prog_name" ; then`
			`# This RPC service is up, check next service...`
			`continue`
			`fi`

			`# Check each line in the file in turn until one of the limit`
			`# checks is hit...`
			`while read _cmp _lim _rest ; do`
			`# Skip comments`
			`case "$_cmp" in`
			`\#*) continue ;;`
			`esac`

			`if _nfs_check_rpc_action "$_cmp" "$_lim" "$_rest" ; then`
			`# Limit was hit on this line, no further checking...`
			`break`
			`fi`
			`done <"$_f"`
			`done`
			`}`
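
As a rough illustration of the file naming and line format that nfs_check_rpc_services() expects, the sketch below reuses its two parameter expansions and its read loop. The helper names, the `/etc/ctdb` path and the check lines are assumptions made up for the example; they are not taken from a shipped configuration.

```shell
# Hypothetical helper: derive the RPC program name from a check file
# path, using the same expansions as nfs_check_rpc_services().
rpc_check_prog_name ()
{
    _t="${1%.check}"               # strip the .check suffix
    echo "${_t##*/[0-9][0-9].}"    # strip the directory and NN. prefix
}

# Hypothetical helper: show how each line read from stdin is split
# into operator, limit and action string.  Comment lines are skipped.
rpc_check_list ()
{
    while read _cmp _lim _rest ; do
        case "$_cmp" in
        \#*) continue ;;
        esac
        echo "op=$_cmp limit=$_lim actions=$_rest"
    done
}

# The path is an assumption for illustration only
rpc_check_prog_name "/etc/ctdb/nfs-rpc-checks.d/20.mountd.check"

rpc_check_list <<EOF
# made-up example contents, highest limit first
-ge 6 verbose unhealthy
-eq 4 verbose restart
EOF
```

Because `read` assigns everything after the second field to the last variable, an action string with several words needs no quoting inside the file.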

eventscripts: Factor out common code from nfs_check_rpc_service() This creates new function _nfs_check_rpc_common(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit cc3bb42e48bbdabd19187c231846b98589b4f4f3) 2013-04-23 00:27:02 +04:00			`_nfs_check_rpc_common ()`
			`{`
			`_prog_name="$1"`

eventscripts: Move rpc.statd existence check into nfs_check_rpc_service () The code in 60.nfs is going to be genericised, so make all the checks look the same. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 15b0f78cbf8d6ba481b7eba9e4fe3f4270214c72) 2013-04-22 23:54:12 +04:00			`# Some platforms don't have separate programs for all services.`
			`case "$_prog_name" in`
			`statd)`
			`which "rpc.${_prog_name}" >/dev/null 2>&1 || return 0`
			`esac`

Eventscripts: clean up 60.nfs monitor event. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9b66057964756a6245bafb436eb6106fb6a2866e) 2010-12-17 08:25:04 +03:00			`case "$_prog_name" in`
eventscripts: NFS RPC checks no longer support "knfsd" No longer used, support removed from test infrastructure. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0eb351ff4c7ee096de7c5e0a59561067091fa32e) 2013-04-23 06:30:33 +04:00			`nfsd)`
Eventscripts: clean up 60.nfs monitor event. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9b66057964756a6245bafb436eb6106fb6a2866e) 2010-12-17 08:25:04 +03:00			`_rpc_prog=nfs`
Revert "Eventscript functions: add optional version to nfs_check_rpc_service()" This reverts commit 92f74fd589467b46c758e116e97417edfe8773d7. This change is unused and is just complicating the function. Conflicts: config/functions (This used to be ctdb commit 77302dbfd85754e02559eccb2dd6c090db0b6b9f) 2013-04-23 00:14:43 +04:00			`_version=3`
Eventscripts: clean up 60.nfs monitor event. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9b66057964756a6245bafb436eb6106fb6a2866e) 2010-12-17 08:25:04 +03:00			`;;`
			`mountd)`
eventscripts: Separate out RPC service restart code While doing this: * Explicitly assign RPC program and version information in _nfs_check_rpc_common(). This is more lines of code but is easier to read. * Don't print the options when starting a service. Trying to print it makes the code messy for little benefit. Update the eventscript unit testing code and a Ganesha test to reflect this. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e8b531405665885196c95fe1608db33a255bf761) 2013-08-02 10:05:46 +04:00			`_rpc_prog=mountd`
			`_version=1`
Eventscripts: clean up 60.nfs monitor event. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9b66057964756a6245bafb436eb6106fb6a2866e) 2010-12-17 08:25:04 +03:00			`;;`
			`rquotad)`
eventscripts: Separate out RPC service restart code While doing this: * Explicitly assign RPC program and version information in _nfs_check_rpc_common(). This is more lines of code but is easier to read. * Don't print the options when starting a service. Trying to print it makes the code messy for little benefit. Update the eventscript unit testing code and a Ganesha test to reflect this. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e8b531405665885196c95fe1608db33a255bf761) 2013-08-02 10:05:46 +04:00			`_rpc_prog=rquotad`
			`_version=1`
Eventscripts: clean up 60.nfs monitor event. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9b66057964756a6245bafb436eb6106fb6a2866e) 2010-12-17 08:25:04 +03:00			`;;`
			`lockd)`
			`_rpc_prog=nlockmgr`
Revert "Eventscript functions: add optional version to nfs_check_rpc_service()" This reverts commit 92f74fd589467b46c758e116e97417edfe8773d7. This change is unused and is just complicating the function. Conflicts: config/functions (This used to be ctdb commit 77302dbfd85754e02559eccb2dd6c090db0b6b9f) 2013-04-23 00:14:43 +04:00			`_version=4`
Eventscripts: clean up 60.nfs monitor event. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9b66057964756a6245bafb436eb6106fb6a2866e) 2010-12-17 08:25:04 +03:00			`;;`
			`statd)`
			`_rpc_prog=status`
eventscripts: Separate out RPC service restart code While doing this: * Explicitly assign RPC program and version information in _nfs_check_rpc_common(). This is more lines of code but is easier to read. * Don't print the options when starting a service. Trying to print it makes the code messy for little benefit. Update the eventscript unit testing code and a Ganesha test to reflect this. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e8b531405665885196c95fe1608db33a255bf761) 2013-08-02 10:05:46 +04:00			`_version=1`
Eventscripts: clean up 60.nfs monitor event. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9b66057964756a6245bafb436eb6106fb6a2866e) 2010-12-17 08:25:04 +03:00			`;;`
			`*)`
			`echo "Internal error: unknown RPC program \"$_prog_name\"."`
			`exit 1`
			`esac`

			`_service_name="nfs_${_prog_name}"`

			`if ctdb_check_rpc "$_rpc_prog" $_version >/dev/null ; then`
			`ctdb_counter_init "$_service_name"`
			`return 0`
			`fi`

			`ctdb_counter_incr "$_service_name"`

eventscripts: Factor out common code from nfs_check_rpc_service() This creates new function _nfs_check_rpc_common(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit cc3bb42e48bbdabd19187c231846b98589b4f4f3) 2013-04-23 00:27:02 +04:00			`return 1`
Eventscripts: clean up 60.nfs monitor event. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9b66057964756a6245bafb436eb6106fb6a2866e) 2010-12-17 08:25:04 +03:00			`}`

eventscripts: nfs_check_rpc_action() should be _nfs_check_rpc_action() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5a717fd495ba5a2bfd481d69f38b68fa4576716f) 2013-04-23 00:28:27 +04:00			`_nfs_check_rpc_action ()`
eventscripts: Factor NFS RPC check action code into nfs_check_rpc_action() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 4b4e7d8f0e8dcbab987e374d06ffaa21c06da0d3) 2013-04-22 09:45:13 +04:00			`{`
			`_cmp="$1"`
			`_limit="$2"`
			`_actions="$3"`

			`if ctdb_check_counter "quiet" "$_cmp" "$_limit" "$_service_name" ; then`
			`return 1`
			`fi`

			`for _action in $_actions ; do`
			`case "$_action" in`
			`verbose)`
			`echo "$ctdb_check_rpc_out"`
			`;;`
eventscripts: Separate out RPC service restart code While doing this: * Explicitly assign RPC program and version information in _nfs_check_rpc_common(). This is more lines of code but is easier to read. * Don't print the options when starting a service. Trying to print it makes the code messy for little benefit. Update the eventscript unit testing code and a Ganesha test to reflect this. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e8b531405665885196c95fe1608db33a255bf761) 2013-08-02 10:05:46 +04:00			`restart)`
			`_nfs_restart_rpc_service "$_prog_name"`
			`;;`
			`restart:b)`
			`_nfs_restart_rpc_service "$_prog_name" true`
eventscripts: Factor NFS RPC check action code into nfs_check_rpc_action() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 4b4e7d8f0e8dcbab987e374d06ffaa21c06da0d3) 2013-04-22 09:45:13 +04:00			`;;`
			`unhealthy)`
			`exit 1`
			`;;`
			`*)`
			`echo "Internal error: unknown action \"$_action\"."`
			`exit 1`
			`esac`
			`done`

			`return 0`
			`}`

eventscripts: Separate out RPC service restart code While doing this: * Explicitly assign RPC program and version information in _nfs_check_rpc_common(). This is more lines of code but is easier to read. * Don't print the options when starting a service. Trying to print it makes the code messy for little benefit. Update the eventscript unit testing code and a Ganesha test to reflect this. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e8b531405665885196c95fe1608db33a255bf761) 2013-08-02 10:05:46 +04:00			`_nfs_restart_rpc_service ()`
			`{`
			`_prog_name="$1"`
			`_background="${2:-false}"`

			`if $_background ; then`
			`_maybe_background="background_with_logging"`
			`else`
			`_maybe_background=""`
			`fi`

			`_p="rpc.${_prog_name}"`

			`case "$_prog_name" in`
			`nfsd)`
			`echo "Trying to restart NFS service"`
			`$_maybe_background startstop_nfs restart`
			`;;`
			`mountd)`
			`echo "Trying to restart $_prog_name [${_p}]"`
			`killall -q -9 "$_p"`
ctdb-scripts: Dump stack traces for hung mountd, rquotad, statd processes Add a corresponding new unit test for statd. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-11-14 05:59:16 +03:00			`nfs_dump_some_threads "$_p"`
eventscripts: Separate out RPC service restart code While doing this: * Explicitly assign RPC program and version information in _nfs_check_rpc_common(). This is more lines of code but is easier to read. * Don't print the options when starting a service. Trying to print it makes the code messy for little benefit. Update the eventscript unit testing code and a Ganesha test to reflect this. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e8b531405665885196c95fe1608db33a255bf761) 2013-08-02 10:05:46 +04:00			`$_maybe_background $_p ${MOUNTD_PORT:+-p} $MOUNTD_PORT`
			`;;`
			`rquotad)`
			`echo "Trying to restart $_prog_name [${_p}]"`
			`killall -q -9 "$_p"`
ctdb-scripts: Dump stack traces for hung mountd, rquotad, statd processes Add a corresponding new unit test for statd. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-11-14 05:59:16 +03:00			`nfs_dump_some_threads "$_p"`
eventscripts: Separate out RPC service restart code While doing this: * Explicitly assign RPC program and version information in _nfs_check_rpc_common(). This is more lines of code but is easier to read. * Don't print the options when starting a service. Trying to print it makes the code messy for little benefit. Update the eventscript unit testing code and a Ganesha test to reflect this. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e8b531405665885196c95fe1608db33a255bf761) 2013-08-02 10:05:46 +04:00			`$_maybe_background $_p ${RQUOTAD_PORT:+-p} $RQUOTAD_PORT`
			`;;`
			`lockd)`
			`echo "Trying to restart lock manager service"`
			`$_maybe_background startstop_nfslock restart`
			`;;`
			`statd)`
			`echo "Trying to restart $_prog_name [${_p}]"`
			`killall -q -9 "$_p"`
ctdb-scripts: Dump stack traces for hung mountd, rquotad, statd processes Add a corresponding new unit test for statd. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-11-14 05:59:16 +03:00			`nfs_dump_some_threads "$_p"`
eventscripts: Separate out RPC service restart code While doing this: * Explicitly assign RPC program and version information in _nfs_check_rpc_common(). This is more lines of code but is easier to read. * Don't print the options when starting a service. Trying to print it makes the code messy for little benefit. Update the eventscript unit testing code and a Ganesha test to reflect this. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e8b531405665885196c95fe1608db33a255bf761) 2013-08-02 10:05:46 +04:00			`$_maybe_background $_p \`
			`${STATD_HOSTNAME:+-n} $STATD_HOSTNAME \`
			`${STATD_PORT:+-p} $STATD_PORT \`
			`${STATD_OUTGOING_PORT:+-o} $STATD_OUTGOING_PORT`
			`;;`
			`*)`
			`echo "Internal error: unknown RPC program \"$_prog_name\"."`
			`exit 1`
			`esac`
			`}`
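
The restart commands above rely on the `${VAR:+-p} $VAR` idiom to pass an option only when its variable is set and non-empty. A minimal sketch of what the idiom expands to follows; `show_port_args` is a hypothetical helper and 597 is an arbitrary port number chosen for the example.

```shell
# Hypothetical helper: report the argument list that the
# ${VAR:+-p} $VAR idiom produces for a given port value.
show_port_args ()
{
    _port="$1"

    # Deliberately unquoted, as in _nfs_restart_rpc_service(): an
    # empty value must expand to no arguments at all, not to an
    # empty string argument.
    set -- ${_port:+-p} $_port

    echo "argc=$# argv=$*"
}

show_port_args ""      # port unset or empty: no arguments are passed
show_port_args 597     # port set: "-p" and the port are passed
```

Quoting the expansions would pass an empty string as an argument when the variable is empty, which is why they are left unquoted here.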

- added monitoring of rpc ports for nfs, and of Samba ports and directories - added monitoring of the ethernet link state When monitoring detects an error, the node loses its public IP address (This used to be ctdb commit 0af57aead8c983511d25774b4ffe09fa5ff26501) 2007-06-06 06:08:42 +04:00			`######################################################`
			`# check that an RPC server is registered with portmap`
			`# and responding to requests`
Eventscripts: clean up 60.nfs monitor event. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9b66057964756a6245bafb436eb6106fb6a2866e) 2010-12-17 08:25:04 +03:00			`# usage: ctdb_check_rpc SERVICE_NAME VERSION`
- added monitoring of rpc ports for nfs, and of Samba ports and directories - added monitoring of the ethernet link state When monitoring detects an error, the node loses its public IP address (This used to be ctdb commit 0af57aead8c983511d25774b4ffe09fa5ff26501) 2007-06-06 06:08:42 +04:00			`######################################################`
Eventscripts: clean up 60.nfs monitor event. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9b66057964756a6245bafb436eb6106fb6a2866e) 2010-12-17 08:25:04 +03:00			`ctdb_check_rpc ()`
			`{`
More eventscript cleanups. Initial smoke testing seems OK. Apart from lots of cleanup work, this also fixes a bug where the share checks didn't used to cope with directory names containing spaces. The previous commit also loaded the config incorrectly. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3c93336ab92c2e4829ff4dc360045bfa6df21d50) 2009-11-20 08:45:36 +03:00			`progname="$1"`
Eventscripts: clean up 60.nfs monitor event. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9b66057964756a6245bafb436eb6106fb6a2866e) 2010-12-17 08:25:04 +03:00			`version="$2"`
60.nfs only fails or warns after 10 consecutive nfsd/statd failures. These failures are sometimes the result of slow restarts so we want to avoid dirtying the logs or marking a node unhealthy because of them, unless they are excessive. For these 2 cases we use the existing fail counting code but hack a temporary service_name in a subshell to allow separate fail counts. We also update ctdb_check_rpc() so that it captures the error output from rpcinfo and we add a message including the service name to the beginning. The error is printed to stdout but is also stored in ctdb_check_rpc_out to allow it to be conditionally used by the caller. This function also now returns non-zero rather than exiting on failure. Other direct rpcinfo calls are replaced by calls to ctdb_check_rpc() for consistency. Option handling code for service restarts is cleaned up so that it fits in 80 columns. A more informative restart message is now used in all cases, printing the exact command being used to start a service. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 79c25fe241cf5d8f92e23d3736823ebaf4e1769d) 2010-11-16 11:31:18 +03:00
eventscripts: New configuration variable $CTDB_RPCINFO_LOCALHOST Passing "localhost" to the rpcinfo command causes overheads, like reading /etc/services multiple times. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 1d61988af9e4fa3621a3e2d06a859bcb53df2d67) 2013-08-05 09:12:14 +04:00			`_localhost="${CTDB_RPCINFO_LOCALHOST:-127.0.0.1}"`

			`if ! ctdb_check_rpc_out=$(rpcinfo -u $_localhost $progname $version 2>&1) ; then`
60.nfs only fails or warns after 10 consecutive nfsd/statd failures. These failures are sometimes the result of slow restarts so we want to avoid dirtying the logs or marking a node unhealthy because of them, unless they are excessive. For these 2 cases we use the existing fail counting code but hack a temporary service_name in a subshell to allow separate fail counts. We also update ctdb_check_rpc() so that it captures the error output from rpcinfo and we add a message including the service name to the beginning. The error is printed to stdout but is also stored in ctdb_check_rpc_out to allow it to be conditionally used by the caller. This function also now returns non-zero rather than exiting on failure. Other direct rpcinfo calls are replaced by calls to ctdb_check_rpc() for consistency. Option handling code for service restarts is cleaned up so that it fits in 80 columns. A more informative restart message is now used in all cases, printing the exact command being used to start a service. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 79c25fe241cf5d8f92e23d3736823ebaf4e1769d) 2010-11-16 11:31:18 +03:00			`ctdb_check_rpc_out="ERROR: $progname failed RPC check:`
			`$ctdb_check_rpc_out"`
			`echo "$ctdb_check_rpc_out"`
			`return 1`
			`fi`
- added monitoring of rpc ports for nfs, and of Samba ports and directories - added monitoring of the ethernet link state When monitoring detects an error, the node loses its public IP address (This used to be ctdb commit 0af57aead8c983511d25774b4ffe09fa5ff26501) 2007-06-06 06:08:42 +04:00			`}`
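The RPC check above returns non-zero instead of exiting, and builds a message that names the failing program. A minimal sketch of that behaviour follows. It assumes a stub `rpcinfo` shell function (hypothetical, with made-up error text) in place of the real binary, and uses the illustrative names `check_rpc` and `check_rpc_out` rather than the real ones.

```shell
#!/bin/sh
# Hypothetical stub standing in for the real rpcinfo binary.
# Called as: rpcinfo -u HOST PROGRAM VERSION; only "nfs" answers here.
rpcinfo ()
{
    if [ "$3" = "nfs" ] ; then
        return 0
    fi
    echo "rpcinfo: RPC: Program not registered" >&2
    return 1
}

# Illustrative copy of the check: capture rpcinfo's output and, on
# failure, prefix it with a line naming the program that failed.
check_rpc ()
{
    progname="$1"
    version="$2"
    _localhost="${CTDB_RPCINFO_LOCALHOST:-127.0.0.1}"

    if ! check_rpc_out=$(rpcinfo -u "$_localhost" "$progname" "$version" 2>&1) ; then
        check_rpc_out="ERROR: $progname failed RPC check:
$check_rpc_out"
        echo "$check_rpc_out"
        return 1
    fi
}

nfs_rc=0
check_rpc nfs 3 >/dev/null || nfs_rc=$?

mountd_rc=0
mountd_msg=$(check_rpc mountd 1) || mountd_rc=$?

echo "nfs_rc=$nfs_rc mountd_rc=$mountd_rc"
echo "$mountd_msg"
```

Because the function returns rather than exits, the caller decides whether a single failure matters, which is what the fail-counting described in the commit messages relies on.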

eventscripts: Assert that $service_name is set in a few key places Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3d0a7d83ddc824961d876fc9afba829c90aef3e7) 2013-04-29 21:45:21 +04:00			`######################################################`
			`# Ensure $service_name is set`
			`assert_service_name ()`
			`{`
			`[ -n "$service_name" ] || die "INTERNAL ERROR: \$service_name not set"`
			`}`

- added monitoring of rpc ports for nfs, and of Samba ports and directories - added monitoring of the ethernet link state When monitoring detects an error, the node loses its public IP address (This used to be ctdb commit 0af57aead8c983511d25774b4ffe09fa5ff26501) 2007-06-06 06:08:42 +04:00			`######################################################`
			`# check a set of directories is available`
Clean up ctdb_check_directories* eventscript functions. There are 2 problems with this code: * The loop in ctdb_check_directories_probe() breaks on filenames containing whitespace. The fix to protect them is to pass "$@" to this function and have it operate on "$@". Note that there's still a problem with whitespace in filenames in the 50.samba eventscript. To fix this ctdb_check_directories_probe should read the filenames from stdin. Another time... * The check for '%' in filenames in ctdb_check_directories_probe() ends up involving several forks. On a modern machine this can cost a couple of minutes when checking a large number of directories. The fix is to use a case statement. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit eb1fecaef9aa5cb85dff7d4f7af8a9878deabed8) 2009-10-12 09:32:49 +04:00			`# return 1 on a missing directory`
eventscripts; Cleanup up ctdb_check_directories() The documentation comments are wrong... and remove option $service_name argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d9e6cb945c5edac9ca6405c9228bf647fab814f5) 2013-04-29 21:48:51 +04:00			`# directories are read from stdin`
- added monitoring of rpc ports for nfs, and of Samba ports and directories - added monitoring of the ethernet link state When monitoring detects an error, the node loses its public IP address (This used to be ctdb commit 0af57aead8c983511d25774b4ffe09fa5ff26501) 2007-06-06 06:08:42 +04:00			`######################################################`
eventscripts; Cleanup up ctdb_check_directories() The documentation comments are wrong... and remove option $service_name argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d9e6cb945c5edac9ca6405c9228bf647fab814f5) 2013-04-29 21:48:51 +04:00			`ctdb_check_directories_probe()`
			`{`
More eventscript cleanups. Initial smoke testing seems OK. Apart from lots of cleanup work, this also fixes a bug where the share checks didn't used to cope with directory names containing spaces. The previous commit also loaded the config incorrectly. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3c93336ab92c2e4829ff4dc360045bfa6df21d50) 2009-11-20 08:45:36 +03:00			`while IFS="" read d ; do`
			`case "$d" in`
			`*%*)`
			`continue`
			`;;`
			`*)`
functions: when checking for a directory also check whether it can be accessed. Thanks to "waKKu" on irc for this improvement. Michael (This used to be ctdb commit 81e1483dd0ce2cd091721e456c0c194cc58442f3) 2010-03-26 18:40:00 +03:00			`[ -d "${d}/." ] || return 1`
More eventscript cleanups. Initial smoke testing seems OK. Apart from lots of cleanup work, this also fixes a bug where the share checks didn't used to cope with directory names containing spaces. The previous commit also loaded the config incorrectly. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3c93336ab92c2e4829ff4dc360045bfa6df21d50) 2009-11-20 08:45:36 +03:00			`esac`
			`done`
allow for probing of directories without raising an error (This used to be ctdb commit 8fed021d11160b137f4140ea02947347250e2959) 2008-07-23 09:35:46 +04:00			`}`
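The probe reads one directory per line from stdin, skips any path containing `%`, and stops at the first directory that is missing or inaccessible. A minimal sketch of driving it follows, using a standalone copy named `probe_dirs` (a hypothetical name) and temporary directories created only for the demonstration. The `-r` flag on `read` is an addition in this sketch, not part of the original.

```shell
#!/bin/sh
# Standalone copy of the probe logic, for illustration only.
probe_dirs ()
{
    while IFS="" read -r d ; do
        case "$d" in
        *%*)
            # Paths containing '%' are skipped rather than checked
            continue
            ;;
        *)
            # The trailing "/." also checks that the directory is accessible
            [ -d "${d}/." ] || return 1
        esac
    done
}

tmp=$(mktemp -d)
mkdir "$tmp/share one"

# A '%' path (skipped) followed by an existing directory with a space
rc_ok=0
printf '%s\n' "/home/%U" "$tmp/share one" | probe_dirs || rc_ok=$?

# A directory that does not exist
rc_missing=0
printf '%s\n' "$tmp/missing" | probe_dirs || rc_missing=$?

echo "ok=$rc_ok missing=$rc_missing"
rm -rf "$tmp"
```

Reading from stdin line by line is what lets directory names with embedded spaces pass through intact.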

			`######################################################`
			`# check a set of directories is available`
eventscripts; Cleanup up ctdb_check_directories() The documentation comments are wrong... and remove option $service_name argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d9e6cb945c5edac9ca6405c9228bf647fab814f5) 2013-04-29 21:48:51 +04:00			`# directories are read from stdin`
allow for probing of directories without raising an error (This used to be ctdb commit 8fed021d11160b137f4140ea02947347250e2959) 2008-07-23 09:35:46 +04:00			`######################################################`
eventscripts; Cleanup up ctdb_check_directories() The documentation comments are wrong... and remove option $service_name argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d9e6cb945c5edac9ca6405c9228bf647fab814f5) 2013-04-29 21:48:51 +04:00			`ctdb_check_directories()`
			`{`
More eventscript cleanups. Initial smoke testing seems OK. Apart from lots of cleanup work, this also fixes a bug where the share checks didn't used to cope with directory names containing spaces. The previous commit also loaded the config incorrectly. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3c93336ab92c2e4829ff4dc360045bfa6df21d50) 2009-11-20 08:45:36 +03:00			`ctdb_check_directories_probe || {`
eventscripts; Cleanup up ctdb_check_directories() The documentation comments are wrong... and remove option $service_name argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d9e6cb945c5edac9ca6405c9228bf647fab814f5) 2013-04-29 21:48:51 +04:00			`echo "ERROR: $service_name directory \"$d\" not available"`
More eventscript cleanups. Initial smoke testing seems OK. Apart from lots of cleanup work, this also fixes a bug where the share checks didn't used to cope with directory names containing spaces. The previous commit also loaded the config incorrectly. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3c93336ab92c2e4829ff4dc360045bfa6df21d50) 2009-11-20 08:45:36 +03:00			`exit 1`
			`}`
- added monitoring of rpc ports for nfs, and of Samba ports and directories - added monitoring of the ethernet link state When monitoring detects an error, the node loses its public IP address (This used to be ctdb commit 0af57aead8c983511d25774b4ffe09fa5ff26501) 2007-06-06 06:08:42 +04:00			`}`

			`######################################################`
			`# check a set of tcp ports`
More eventscript cleanups. Initial smoke testing seems OK. Apart from lots of cleanup work, this also fixes a bug where the share checks didn't used to cope with directory names containing spaces. The previous commit also loaded the config incorrectly. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3c93336ab92c2e4829ff4dc360045bfa6df21d50) 2009-11-20 08:45:36 +03:00			`# usage: ctdb_check_tcp_ports <ports...>`
- added monitoring of rpc ports for nfs, and of Samba ports and directories - added monitoring of the ethernet link state When monitoring detects an error, the node loses its public IP address (This used to be ctdb commit 0af57aead8c983511d25774b4ffe09fa5ff26501) 2007-06-06 06:08:42 +04:00			`######################################################`
Eventscripts - weaken TCP port check message if CTDB has just been started. Sometimes smbd and other services can take a while to start, especially when there is a lot of activity after ctdbd has just started. The TCP port check can then pollute the logs with lots of "ERROR" messages and possibly extra debug. This creates a flag file when a service is started (but not restarted) and this flag is removed the first time that TCP port checks succeed for that service. When a port check fails and the flag file still exists, a less extreme "INFO" message is printed rather than the usual "ERROR" message. This means that until the node actually becomes healthy we see more friendly messages. The subtext is that we're hearing false positive reports "recreates" of CQ S1024874 (samba stopped responding on port 445) quite often when ctdbd is started. This reduces the chances of people reporting such false recreates... Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 571865eb6ef847857129d0b1e2ba5fa7254bfe8c) 2011-08-05 10:39:57 +04:00
			`# This flag file is created when a service is initially started. It`
			`# is deleted the first time TCP port checks for that service succeed.`
			`# Until then ctdb_check_tcp_ports() prints a more subtle "error"`
			`# message if a port check fails.`
			`_ctdb_check_tcp_common ()`
			`{`
eventscripts: Assert that $service_name is set in a few key places Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3d0a7d83ddc824961d876fc9afba829c90aef3e7) 2013-04-29 21:45:21 +04:00			`assert_service_name`
Eventscripts - weaken TCP port check message if CTDB has just been started. Sometimes smbd and other services can take a while to start, especially when there is a lot of activity after ctdbd has just started. The TCP port check can then pollute the logs with lots of "ERROR" messages and possibly extra debug. This creates a flag file when a service is started (but not restarted) and this flag is removed the first time that TCP port checks succeed for that service. When a port check fails and the flag file still exists, a less extreme "INFO" message is printed rather than the usual "ERROR" message. This means that until the node actually becomes healthy we see more friendly messages. The subtext is that we're hearing false positive reports "recreates" of CQ S1024874 (samba stopped responding on port 445) quite often when ctdbd is started. This reduces the chances of people reporting such false recreates... Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 571865eb6ef847857129d0b1e2ba5fa7254bfe8c) 2011-08-05 10:39:57 +04:00			`_ctdb_service_started_file="$ctdb_fail_dir/$service_name.started"`
			`}`

			`ctdb_check_tcp_init ()`
			`{`
			`_ctdb_check_tcp_common`
			`mkdir -p "${_ctdb_service_started_file%/*}" # dirname`
			`touch "$_ctdb_service_started_file"`
			`}`

eventscripts: Fold ctdb_check_tcp_ports_ctdb() into ctdb_check_tcp_ports() A generic framework is no longer needed now that the "ctdb" checker is the only one left. Simplify the code. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 044d302b41a2040642355401e3236fcecc3a620a) 2013-10-17 08:23:35 +04:00			`# Check whether something is listening on all of the given TCP ports`
			`# using the "ctdb checktcpport" command.`
Eventscript functions: optimise ctdb_check_tcp_ports() and add debug. ctdb_check_tcp_ports() runs "netstat -a -t -n" in a loop for each port. There are 2 problems with this: * Netstat is run on each loop iteration when it need only be run once. * The -a option is used to list all connections but the function only cares about the listening ports. There may be many thousands of non-listening ports to grep through. This changes ctdb_check_tcp_ports() to run netstat with the -l option instead of the -a option. It also only runs netstat once before the main loop. When a port is found to not be listening the output of the netstat command is now dumped to help with debugging. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 830355a8b18c53cfcc3ad1e3009bbb1a7a681fa0) 2011-07-05 05:32:06 +04:00			`ctdb_check_tcp_ports()`
			`{`
Eventscripts - generalise TCP port checking plus new nmap-based checker Split the netstat-specific parts of ctdb_check_tcp_ports() into new function ctdb_check_tcp_ports_netstat(). Implement new ctdb_check_tcp_ports_nmap() function that uses "nmap -PS" to check if the desired ports are listening. ctdb_check_ctdb_ports() now uses new configuration variable CTDB_TCP_PORT_CHECKERS to decide which port checkers to try. Default value is currently "nmap netstat". If nmap is not found then this will fall back to netstat - if logging is at debug level this will also fill the logs with message saying the nmap checker failed. This indicates that either nmap should be installed or the default value of CTDB_TCP_PORT_CHECKERS should be changed (in a configuration file) to avoid trying to use nmap. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d9651175b40b9454e7d4e98291955fcf1445085e) 2011-08-17 06:12:20 +04:00			`if [ -z "$1" ] ; then`
			`echo "INTERNAL ERROR: ctdb_check_tcp_ports - no ports specified"`
			`exit 1`
			`fi`

eventscripts: Fold ctdb_check_tcp_ports_ctdb() into ctdb_check_tcp_ports() A generic framework is no longer needed now that the "ctdb" checker is the only one left. Simplify the code. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 044d302b41a2040642355401e3236fcecc3a620a) 2013-10-17 08:23:35 +04:00			`for _p ; do # process each function argument (port)`
			`_cmd="ctdb checktcpport $_p"`
			`_out=$($_cmd 2>&1)`
			`_ret=$?`
			`case "$_ret" in`
Eventscripts - generalise TCP port checking plus new nmap-based checker Split the netstat-specific parts of ctdb_check_tcp_ports() into new function ctdb_check_tcp_ports_netstat(). Implement new ctdb_check_tcp_ports_nmap() function that uses "nmap -PS" to check if the desired ports are listening. ctdb_check_ctdb_ports() now uses new configuration variable CTDB_TCP_PORT_CHECKERS to decide which port checkers to try. Default value is currently "nmap netstat". If nmap is not found then this will fall back to netstat - if logging is at debug level this will also fill the logs with message saying the nmap checker failed. This indicates that either nmap should be installed or the default value of CTDB_TCP_PORT_CHECKERS should be changed (in a configuration file) to avoid trying to use nmap. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d9651175b40b9454e7d4e98291955fcf1445085e) 2011-08-17 06:12:20 +04:00			`0)`
			`_ctdb_check_tcp_common`
			`if [ ! -f "$_ctdb_service_started_file" ] ; then`
			`echo "ERROR: $service_name tcp port $_p is not responding"`
eventscripts: Fold ctdb_check_tcp_ports_ctdb() into ctdb_check_tcp_ports() A generic framework is no longer needed now that the "ctdb" checker is the only one left. Simplify the code. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 044d302b41a2040642355401e3236fcecc3a620a) 2013-10-17 08:23:35 +04:00			`debug "\"ctdb checktcpport $_p\" was able to bind to port"`
Eventscripts - generalise TCP port checking plus new nmap-based checker Split the netstat-specific parts of ctdb_check_tcp_ports() into new function ctdb_check_tcp_ports_netstat(). Implement new ctdb_check_tcp_ports_nmap() function that uses "nmap -PS" to check if the desired ports are listening. ctdb_check_ctdb_ports() now uses new configuration variable CTDB_TCP_PORT_CHECKERS to decide which port checkers to try. Default value is currently "nmap netstat". If nmap is not found then this will fall back to netstat - if logging is at debug level this will also fill the logs with message saying the nmap checker failed. This indicates that either nmap should be installed or the default value of CTDB_TCP_PORT_CHECKERS should be changed (in a configuration file) to avoid trying to use nmap. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d9651175b40b9454e7d4e98291955fcf1445085e) 2011-08-17 06:12:20 +04:00			`else`
			`echo "INFO: $service_name tcp port $_p is not responding"`
			`fi`

Eventscripts - new default TCP port checker using "ctdb checktcpport" New function ctdb_check_tcp_ports_ctdb(). This should be fast... and is now the default checker. If it fails in an unexpected way we fall back to the nmap and netstat checkers. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a1e16a707ce204817531a61455000361f972080a) 2011-08-17 08:02:45 +04:00			`return 1`
			`;;`
			`98)`
			`# Couldn't bind, something already listening, next port...`
			`continue`
			`;;`
			`*)`
eventscripts: Fold ctdb_check_tcp_ports_ctdb() into ctdb_check_tcp_ports() A generic framework is no longer needed now that the "ctdb" checker is the only one left. Simplify the code. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 044d302b41a2040642355401e3236fcecc3a620a) 2013-10-17 08:23:35 +04:00			`echo "ERROR: unexpected error running \"ctdb checktcpport\""`
			`debug <<EOF`
			`ctdb checktcpport (exited with $_ret) with output:`
Eventscripts - new default TCP port checker using "ctdb checktcpport" New function ctdb_check_tcp_ports_ctdb(). This should be fast... and is now the default checker. If it fails in an unexpected way we fall back to the nmap and netstat checkers. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a1e16a707ce204817531a61455000361f972080a) 2011-08-17 08:02:45 +04:00			`$_out`
eventscripts: Fold ctdb_check_tcp_ports_ctdb() into ctdb_check_tcp_ports() A generic framework is no longer needed now that the "ctdb" checker is the only one left. Simplify the code. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 044d302b41a2040642355401e3236fcecc3a620a) 2013-10-17 08:23:35 +04:00			`EOF`
			`return $_ret`
Eventscripts - new default TCP port checker using "ctdb checktcpport" New function ctdb_check_tcp_ports_ctdb(). This should be fast... and is now the default checker. If it fails in an unexpected way we fall back to the nmap and netstat checkers. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a1e16a707ce204817531a61455000361f972080a) 2011-08-17 08:02:45 +04:00			`esac`
			`done`

eventscripts: Fold ctdb_check_tcp_ports_ctdb() into ctdb_check_tcp_ports() A generic framework is no longer needed now that the "ctdb" checker is the only one left. Simplify the code. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 044d302b41a2040642355401e3236fcecc3a620a) 2013-10-17 08:23:35 +04:00			`# All ports listening`
			`_ctdb_check_tcp_common`
			`rm -f "$_ctdb_service_started_file"`
Eventscripts - new default TCP port checker using "ctdb checktcpport" New function ctdb_check_tcp_ports_ctdb(). This should be fast... and is now the default checker. If it fails in an unexpected way we fall back to the nmap and netstat checkers. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a1e16a707ce204817531a61455000361f972080a) 2011-08-17 08:02:45 +04:00			`return 0`
			`}`
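The port check above inverts the usual meaning of success. Exit status 0 means the checker managed to bind the port, so nothing is listening. Status 98 (the value of EADDRINUSE on Linux) means the bind failed because a listener already holds the port. A minimal sketch of that classification follows, using a hypothetical stub `fake_checktcpport` in place of `ctdb checktcpport`.

```shell
#!/bin/sh
# Hypothetical stand-in for "ctdb checktcpport": pretends that only
# port 445 has a listener.
fake_checktcpport ()
{
    case "$1" in
    445) return 98 ;;   # bind failed: something is already listening
    *)   return 0 ;;    # bind succeeded: nothing is listening
    esac
}

# Map the checker's exit status to a human-readable result
classify_port ()
{
    _rc=0
    fake_checktcpport "$1" || _rc=$?
    case "$_rc" in
    0)  echo "port $1: not listening" ;;
    98) echo "port $1: listening" ;;
    *)  echo "port $1: unexpected error $_rc" ;;
    esac
}

r445=$(classify_port 445)
r139=$(classify_port 139)
echo "$r445"
echo "$r139"
```

In the real function, a status of 0 is the failure path. The wording of the message then depends on whether the service's `.started` flag file still exists.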

From : Flavio Carmo Junior <carmo.flavio@gmail.com> Add a helper function that checks whether a unix domain socket exists and there is a daemon LISTENING to it similar to the existing function to check for a daemon LISTENING to a tcp/ip socket. (This used to be ctdb commit 025a836ab3be3c078fccd8c10b10dfffbfdd94d0) 2009-05-19 02:47:19 +04:00			`######################################################`
			`# check a unix socket`
			`# usage: ctdb_check_unix_socket <socket_path>`
			`######################################################`
			`ctdb_check_unix_socket() {`
More eventscript cleanups. Initial smoke testing seems OK. Apart from lots of cleanup work, this also fixes a bug where the share checks didn't used to cope with directory names containing spaces. The previous commit also loaded the config incorrectly. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3c93336ab92c2e4829ff4dc360045bfa6df21d50) 2009-11-20 08:45:36 +03:00			`socket_path="$1"`
			`[ -z "$socket_path" ] && return`
From : Flavio Carmo Junior <carmo.flavio@gmail.com> Add a helper function that checks whether a unix domain socket exists and there is a daemon LISTENING to it similar to the existing function to check for a daemon LISTENING to a tcp/ip socket. (This used to be ctdb commit 025a836ab3be3c078fccd8c10b10dfffbfdd94d0) 2009-05-19 02:47:19 +04:00
More eventscript cleanups. Initial smoke testing seems OK. Apart from lots of cleanup work, this also fixes a bug where the share checks didn't used to cope with directory names containing spaces. The previous commit also loaded the config incorrectly. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3c93336ab92c2e4829ff4dc360045bfa6df21d50) 2009-11-20 08:45:36 +03:00			`if ! netstat --unix -a -n | grep -q "^unix.*LISTEN.*${socket_path}$"; then`
			`echo "ERROR: $service_name socket $socket_path not found"`
			`return 1`
From : Flavio Carmo Junior <carmo.flavio@gmail.com> Add a helper function that checks whether a unix domain socket exists and there is a daemon LISTENING to it similar to the existing function to check for a daemon LISTENING to a tcp/ip socket. (This used to be ctdb commit 025a836ab3be3c078fccd8c10b10dfffbfdd94d0) 2009-05-19 02:47:19 +04:00			`fi`
			`}`

check winbind in monitoring event too (This used to be ctdb commit bccba656c21d0edbd9840401a3c43a76b1b3bc05) 2007-06-17 06:05:29 +04:00			`######################################################`
			`# check a command returns zero status`
eventscripts: Clean up ctdb_check_command() * Command is now multiple arguments, preserving quoting * $service_name no longer printed, no longer an argument * Debug output from failed command Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9e25fb261447a196de05937052779b36e75e7215) 2013-04-29 21:54:17 +04:00			`# usage: ctdb_check_command <command>`
check winbind in monitoring event too (This used to be ctdb commit bccba656c21d0edbd9840401a3c43a76b1b3bc05) 2007-06-17 06:05:29 +04:00			`######################################################`
eventscripts: Clean up ctdb_check_command() * Command is now multiple arguments, preserving quoting * $service_name no longer printed, no longer an argument * Debug output from failed command Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9e25fb261447a196de05937052779b36e75e7215) 2013-04-29 21:54:17 +04:00			`ctdb_check_command ()`
			`{`
			`_out=$("$@" 2>&1) || {`
			`echo "ERROR: $* returned error"`
			`echo "$_out" | debug`
			`exit 1`
			`}`
check winbind in monitoring event too (This used to be ctdb commit bccba656c21d0edbd9840401a3c43a76b1b3bc05) 2007-06-17 06:05:29 +04:00			`}`
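Since `ctdb_check_command` takes the command as separate arguments and exits on failure, it may help to see both paths side by side. The sketch below uses a copy named `check_command` and a hypothetical stub for `debug`, which is defined elsewhere in the functions file. Each call runs in a command-substitution subshell so that the `exit 1` does not end the demonstration.

```shell
#!/bin/sh
# Hypothetical stand-in for debug(): just discards what it reads on stdin.
debug () { cat >/dev/null ; }

# Illustrative copy of the check: run the command given as arguments,
# capture its output, and exit non-zero with a message if it fails.
check_command ()
{
    _out=$("$@" 2>&1) || {
        echo "ERROR: $* returned error"
        echo "$_out" | debug
        exit 1
    }
}

ok_rc=0
ok_msg=$(check_command true) || ok_rc=$?

fail_rc=0
fail_msg=$(check_command false) || fail_rc=$?

echo "ok_rc=$ok_rc fail_rc=$fail_rc"
echo "$fail_msg"
```

Passing the command through `"$@"` keeps each argument's quoting intact, which a single command string would not.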
move the kill_tcp_connections() function from 10.interfaces to functions (This used to be ctdb commit 055948530fb16bf49c42fc4489f29a21665156c0) 2007-10-11 01:27:38 +04:00
			`################################################`
			`# kill off any TCP connections with the given IP`
			`################################################`
eventscripts: Minor cleanups for killtcp/tickle functions Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 25ef4f655f1efc833deb5e244f9fff461e92f439) 2013-05-06 10:23:25 +04:00			`kill_tcp_connections ()`
			`{`
			`_ip="$1"`
eventscripts: Reimplement kill_tcp_connections_local_only() ... using kill_tcp_connections() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 10e4db8f796d1e3259733180494db3b4bbad291a) 2013-04-30 00:19:18 +04:00
			`_oneway=false`
			`if [ "$2" = "oneway" ] ; then`
			`_oneway=true`
			`fi`

eventscripts: Minor cleanups for killtcp/tickle functions Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 25ef4f655f1efc833deb5e244f9fff461e92f439) 2013-05-06 10:23:25 +04:00			`get_tcp_connections_for_ip "$_ip" | {`
			`_killcount=0`
eventscripts: kill_tcp_connections() should send connections to stdin This avoids issuing multiple "ctdb killtcp" commands to terminate tcp connections, one per connection. This will considerably reduce the time when there is a large number of tcp connections. This also makes it possible to avoid calling "ctdb killtcp" when there are no connections. Add a couple of unit tests for killtcp and update eventscript unit test infrastructure to support. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit a20d94717d2e4ab866d8a002cdf39c0669b74c6a) 2013-07-25 07:40:43 +04:00			`_connections=""`
			`_nl="`
			`"`
			`while read _dst _src; do`
			`_destport="${_dst##*:}"`
eventscripts: Refactor connection listing in killtcp and tickle functions Uses new function get_tcp_connections_for_ip(). This avoids using a temporary file and running netstat twice. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a621622903c7ef17764b15293d6ea8df5a53c7e1) 2013-04-30 00:25:26 +04:00			`__oneway=$_oneway`
			`case $_destport in`
			`# we only do one-way killtcp for CIFS`
			`139|445) __oneway=true ;;`
			`esac`
eventscripts: kill_tcp_connections() should send connections to stdin This avoids issuing multiple "ctdb killtcp" commands to terminate tcp connections, one per connection. This will considerably reduce the time when there is a large number of tcp connections. This also makes it possible to avoid calling "ctdb killtcp" when there are no connections. Add a couple of unit tests for killtcp and update eventscript unit test infrastructure to support. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit a20d94717d2e4ab866d8a002cdf39c0669b74c6a) 2013-07-25 07:40:43 +04:00
			`echo "Killing TCP connection $_src $_dst"`
			`_connections="${_connections}${_nl}${_src} ${_dst}"`
eventscripts: Refactor connection listing in killtcp and tickle functions Uses new function get_tcp_connections_for_ip(). This avoids using a temporary file and running netstat twice. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a621622903c7ef17764b15293d6ea8df5a53c7e1) 2013-04-30 00:25:26 +04:00			`if ! $__oneway ; then`
eventscripts: kill_tcp_connections() should send connections to stdin This avoids issuing multiple "ctdb killtcp" commands to terminate tcp connections, one per connection. This will considerably reduce the time when there is a large number of tcp connections. This also makes it possible to avoid calling "ctdb killtcp" when there are no connections. Add a couple of unit tests for killtcp and update eventscript unit test infrastructure to support. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit a20d94717d2e4ab866d8a002cdf39c0669b74c6a) 2013-07-25 07:40:43 +04:00			`_connections="${_connections}${_nl}${_dst} ${_src}"`
eventscripts: Refactor connection listing in killtcp and tickle functions Uses new function get_tcp_connections_for_ip(). This avoids using a temporary file and running netstat twice. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a621622903c7ef17764b15293d6ea8df5a53c7e1) 2013-04-30 00:25:26 +04:00			`fi`
update the socketkiller in the eventscripts to be able to handle ipv6 (This used to be ctdb commit 6da7b36b7ccc4ee9b809867ea32036f09a801bb3) 2008-08-20 03:47:00 +04:00
eventscripts: Minor cleanups for killtcp/tickle functions Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 25ef4f655f1efc833deb5e244f9fff461e92f439) 2013-05-06 10:23:25 +04:00			`_killcount=$(($_killcount + 1))`
eventscripts: Refactor connection listing in killtcp and tickle functions Uses new function get_tcp_connections_for_ip(). This avoids using a temporary file and running netstat twice. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a621622903c7ef17764b15293d6ea8df5a53c7e1) 2013-04-30 00:25:26 +04:00			`done`

eventscripts: Minor cleanups for killtcp/tickle functions Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 25ef4f655f1efc833deb5e244f9fff461e92f439) 2013-05-06 10:23:25 +04:00			`if [ $_killcount -eq 0 ] ; then`
			`return`
			`fi`
eventscripts: Tweak the timeout check in kill_tcp_connections() This has 2 advantages: 1. It uses get_tcp_connections_for_ip() to check for leftover connections, instead of custom code. 2. It checks for the timeout condition before sleeping. The current code sleeps and then checks, so wastes a second. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 60a08eb96e1d97aab31e9bd4af01683c650541c2) 2013-04-30 05:39:46 +04:00
eventscripts: kill_tcp_connections() should send connections to stdin This avoids issuing multiple "ctdb killtcp" commands to terminate tcp connections, one per connection. This will considerably reduce the time when there is a large number of tcp connections. This also makes it possible to avoid calling "ctdb killtcp" when there are no connections. Add a couple of unit tests for killtcp and update eventscript unit test infrastructure to support. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit a20d94717d2e4ab866d8a002cdf39c0669b74c6a) 2013-07-25 07:40:43 +04:00			`echo "$_connections" | ctdb killtcp || {`
			`echo "Failed to send killtcp control"`
			`return`
			`}`

eventscripts: Refactor connection listing in killtcp and tickle functions Uses new function get_tcp_connections_for_ip(). This avoids using a temporary file and running netstat twice. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a621622903c7ef17764b15293d6ea8df5a53c7e1) 2013-04-30 00:25:26 +04:00			`_count=0`
eventscripts: Tweak the timeout check in kill_tcp_connections() This has 2 advantages: 1. It uses get_tcp_connections_for_ip() to check for leftover connections, instead of custom code. 2. It checks for the timeout condition before sleeping. The current code sleeps and then checks, so wastes a second. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 60a08eb96e1d97aab31e9bd4af01683c650541c2) 2013-04-30 05:39:46 +04:00			`while : ; do`
eventscripts: Print a message when waiting for TCP connections to be killed This makes the gaps in the logs more obvious. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 11fbf4789d783dd0bac22754b374dd9ea4b03bad) 2013-08-06 06:42:13 +04:00			`_remaining=$(get_tcp_connections_for_ip $_ip \| wc -l)`

			`if [ $_remaining -eq 0 ] ; then`
eventscripts: Minor cleanups for killtcp/tickle functions Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 25ef4f655f1efc833deb5e244f9fff461e92f439) 2013-05-06 10:23:25 +04:00			`echo "Killed $_killcount TCP connections to released IP $_ip"`
eventscripts: Tweak the timeout check in kill_tcp_connections() This has 2 advantages: 1. It uses get_tcp_connections_for_ip() to check for leftover connections, instead of custom code. 2. It checks for the timeout condition before sleeping. The current code sleeps and then checks, so wastes a second. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 60a08eb96e1d97aab31e9bd4af01683c650541c2) 2013-04-30 05:39:46 +04:00			`return`
			`fi`

			`_count=$(($_count + 1))`
			`if [ $_count -gt 3 ] ; then`
ctdb/eventscripts: Print a count if killing TCP connections times out Also update related test Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-01-13 09:34:50 +04:00			`echo "Timed out killing tcp connections for IP $_ip ($_remaining remaining)"`
eventscripts: Tweak the timeout check in kill_tcp_connections() This has 2 advantages: 1. It uses get_tcp_connections_for_ip() to check for leftover connections, instead of custom code. 2. It checks for the timeout condition before sleeping. The current code sleeps and then checks, so wastes a second. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 60a08eb96e1d97aab31e9bd4af01683c650541c2) 2013-04-30 05:39:46 +04:00			`return`
			`fi`

eventscripts: Print a message when waiting for TCP connections to be killed This makes the gaps in the logs more obvious. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 11fbf4789d783dd0bac22754b374dd9ea4b03bad) 2013-08-06 06:42:13 +04:00			`echo "Waiting for $_remaining connections to be killed for IP $_ip"`
eventscripts: Tweak the timeout check in kill_tcp_connections() This has 2 advantages: 1. It uses get_tcp_connections_for_ip() to check for leftover connections, instead of custom code. 2. It checks for the timeout condition before sleeping. The current code sleeps and then checks, so wastes a second. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 60a08eb96e1d97aab31e9bd4af01683c650541c2) 2013-04-30 05:39:46 +04:00			`sleep 1`
eventscripts: Refactor connection listing in killtcp and tickle functions Uses new function get_tcp_connections_for_ip(). This avoids using a temporary file and running netstat twice. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a621622903c7ef17764b15293d6ea8df5a53c7e1) 2013-04-30 00:25:26 +04:00			`done`
			`}`
create a varient of kill_tcp_connections that only kills off the local side of a connection (This used to be ctdb commit dc2f28f7c988364b5d45f3048be4db3e5ff113b3) 2009-03-24 06:05:31 +03:00			`}`
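
The check-before-sleep loop at the end of `kill_tcp_connections()` can be tried in isolation. The following is only a sketch: `wait_for_zero` and its `_max_tries` argument are hypothetical names that do not exist in the functions file, and the explicit 0/1 return status is an assumption added for illustration (the original simply returns).

```shell
# Hypothetical helper, not part of the functions file.
# Runs the command given after the first argument; that command must
# print a count.  Loops until the count is 0, testing for timeout
# BEFORE sleeping so that no second is wasted on the final pass.
wait_for_zero ()
{
    _max_tries="$1" ; shift

    _count=0
    while : ; do
        _remaining=$("$@")

        if [ "$_remaining" -eq 0 ] ; then
            echo "done"
            return 0
        fi

        _count=$((_count + 1))
        if [ "$_count" -gt "$_max_tries" ] ; then
            echo "timed out ($_remaining remaining)"
            return 1
        fi

        sleep 1
    done
}

# Nothing left to wait for, so this returns immediately.
wait_for_zero 3 echo 0
```

With `_max_tries` set to 3, as in the loop above, the count is checked at most four times and the function sleeps at most three times before giving up.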

			`##################################################################`
			`# kill off the local end for any TCP connections with the given IP`
			`##################################################################`
eventscripts: Minor cleanups for killtcp/tickle functions Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 25ef4f655f1efc833deb5e244f9fff461e92f439) 2013-05-06 10:23:25 +04:00			`kill_tcp_connections_local_only ()`
eventscripts: Reimplement kill_tcp_connections_local_only() ... using kill_tcp_connections() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 10e4db8f796d1e3259733180494db3b4bbad291a) 2013-04-30 00:19:18 +04:00			`{`
			`kill_tcp_connections "$1" "oneway"`
move the kill_tcp_connections() function from 10.interfaces to functions (This used to be ctdb commit 055948530fb16bf49c42fc4489f29a21665156c0) 2007-10-11 01:27:38 +04:00			`}`

config/functions: add tickle_tcp_connections() metze (This used to be ctdb commit 2397f13d7b5ca3847ef148187c6b179d06f6a47a) 2009-12-18 11:43:20 +03:00			`##################################################################`
			`# tickle any TCP connections with the given IP`
			`##################################################################`
eventscripts: Minor cleanups for killtcp/tickle functions Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 25ef4f655f1efc833deb5e244f9fff461e92f439) 2013-05-06 10:23:25 +04:00			`tickle_tcp_connections ()`
			`{`
			`_ip="$1"`
config/functions: add tickle_tcp_connections() metze (This used to be ctdb commit 2397f13d7b5ca3847ef148187c6b179d06f6a47a) 2009-12-18 11:43:20 +03:00
eventscripts: Minor cleanups for killtcp/tickle functions Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 25ef4f655f1efc833deb5e244f9fff461e92f439) 2013-05-06 10:23:25 +04:00			`get_tcp_connections_for_ip "$_ip" \|`
eventscripts: Refactor connection listing in killtcp and tickle functions Uses new function get_tcp_connections_for_ip(). This avoids using a temporary file and running netstat twice. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a621622903c7ef17764b15293d6ea8df5a53c7e1) 2013-04-30 00:25:26 +04:00			`{`
eventscripts: In killtcp/tickle functions, $_failed should be boolean Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 319c1b68d5aa78f82a68febcad233a7c78afc887) 2013-04-30 00:31:30 +04:00			`_failed=false`

eventscripts: Refactor connection listing in killtcp and tickle functions Uses new function get_tcp_connections_for_ip(). This avoids using a temporary file and running netstat twice. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a621622903c7ef17764b15293d6ea8df5a53c7e1) 2013-04-30 00:25:26 +04:00			`while read dest src; do`
			`echo "Tickle TCP connection $src $dest"`
eventscripts: In killtcp/tickle functions, $_failed should be boolean Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 319c1b68d5aa78f82a68febcad233a7c78afc887) 2013-04-30 00:31:30 +04:00			`ctdb tickle $src $dest >/dev/null 2>&1 \|\| _failed=true`
eventscripts: Refactor connection listing in killtcp and tickle functions Uses new function get_tcp_connections_for_ip(). This avoids using a temporary file and running netstat twice. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a621622903c7ef17764b15293d6ea8df5a53c7e1) 2013-04-30 00:25:26 +04:00			`echo "Tickle TCP connection $dest $src"`
eventscripts: In killtcp/tickle functions, $_failed should be boolean Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 319c1b68d5aa78f82a68febcad233a7c78afc887) 2013-04-30 00:31:30 +04:00			`ctdb tickle $dest $src >/dev/null 2>&1 \|\| _failed=true`
eventscripts: Refactor connection listing in killtcp and tickle functions Uses new function get_tcp_connections_for_ip(). This avoids using a temporary file and running netstat twice. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a621622903c7ef17764b15293d6ea8df5a53c7e1) 2013-04-30 00:25:26 +04:00			`done`

eventscripts: In killtcp/tickle functions, $_failed should be boolean Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 319c1b68d5aa78f82a68febcad233a7c78afc887) 2013-04-30 00:31:30 +04:00			`if $_failed ; then`
eventscripts: Refactor connection listing in killtcp and tickle functions Uses new function get_tcp_connections_for_ip(). This avoids using a temporary file and running netstat twice. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a621622903c7ef17764b15293d6ea8df5a53c7e1) 2013-04-30 00:25:26 +04:00			`echo "Failed to send tickle control"`
eventscripts: In killtcp/tickle functions, $_failed should be boolean Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 319c1b68d5aa78f82a68febcad233a7c78afc887) 2013-04-30 00:31:30 +04:00			`fi`
config/functions: add tickle_tcp_connections() metze (This used to be ctdb commit 2397f13d7b5ca3847ef148187c6b179d06f6a47a) 2009-12-18 11:43:20 +03:00			`}`
			`}`

eventscripts: Refactor connection listing in killtcp and tickle functions Uses new function get_tcp_connections_for_ip(). This avoids using a temporary file and running netstat twice. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a621622903c7ef17764b15293d6ea8df5a53c7e1) 2013-04-30 00:25:26 +04:00			`get_tcp_connections_for_ip ()`
			`{`
			`_ip="$1"`

			`netstat -tn \| awk -v ip="$_ip" \`
			`'index($1, "tcp") == 1 && \`
			`(index($4, ip ":") == 1 \|\| index($4, "::ffff:" ip ":") == 1) \`
			`&& $6 == "ESTABLISHED" \`
			`{print $4" "$5}'`
			`}`
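
To see what the awk filter in `get_tcp_connections_for_ip()` selects, it can be run against canned input instead of live `netstat -tn` output. This is a minimal sketch: `filter_established_for_ip` and `sample_netstat` are hypothetical names, and the addresses are made-up sample data.

```shell
# Hypothetical helper: the same awk filter as above, reading
# "netstat -tn"-style text from stdin rather than running netstat.
filter_established_for_ip ()
{
    _ip="$1"

    awk -v ip="$_ip" \
        'index($1, "tcp") == 1 &&
         (index($4, ip ":") == 1 || index($4, "::ffff:" ip ":") == 1) &&
         $6 == "ESTABLISHED" {print $4" "$5}'
}

# Made-up sample data, for illustration only.
sample_netstat ()
{
    cat <<EOF
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.1:445            10.0.0.50:51000         ESTABLISHED
tcp        0      0 10.0.0.1:2049           10.0.0.51:900           TIME_WAIT
tcp        0      0 10.0.0.11:445           10.0.0.52:51001         ESTABLISHED
tcp6       0      0 ::ffff:10.0.0.1:445     ::ffff:10.0.0.53:51002  ESTABLISHED
EOF
}

sample_netstat | filter_established_for_ip "10.0.0.1"
```

Appending ":" to the address before matching is what keeps 10.0.0.11 from being mistaken for 10.0.0.1, and the "::ffff:" alternative picks up IPv4-mapped addresses reported on tcp6 sockets. Only the first and last sample connections pass the filter.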

create a startstop_nfs function that can start/stop the nfs service of different platforms (This used to be ctdb commit f6cc6bd1f62138fbf812d1917f7341e2fa2323da) 2008-02-11 01:35:37 +03:00			`########################################################`
Eventscripts: Modernise 60.ganesha to match 60.nfs Originally from Srikrishan Malik <srikrishan.malik@in.ibm.com> with some style changes by me. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 637cab6304dae66b85668506028c76ea1ee88980) 2012-05-16 11:24:21 +04:00			`# start/stop the Ganesha nfs service`
			`########################################################`
			`startstop_ganesha()`
			`{`
Changes for unobtrusive recovery and new method for health check. Unobtrusive recovery: Ganesha will not be restarted on failovers. Ganesha health: Use the counters in /var/lib/nfs/ganesha_local to track progress instead of the null call which can timeout if the server is too busy. Signed-off-by: Srikrishan Malik <srimalik@in.ibm.com> Signed-off-by: Lance Russell <lancerus@us.ibm.com> (This used to be ctdb commit 0e651e9da0f1f3c836b4474612ab13d0ccd272d9) 2013-01-09 14:41:39 +04:00			`_service_name="nfs-ganesha-$CTDB_CLUSTER_FILESYSTEM_TYPE"`
Eventscripts: Modernise 60.ganesha to match 60.nfs Originally from Srikrishan Malik <srikrishan.malik@in.ibm.com> with some style changes by me. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 637cab6304dae66b85668506028c76ea1ee88980) 2012-05-16 11:24:21 +04:00			`case "$1" in`
			`start)`
			`service "$_service_name" start`
			`;;`
			`stop)`
			`service "$_service_name" stop`
			`;;`
			`restart)`
ctdb-scripts: Add rpc.statd stack dumping to Ganesha restart Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-11-14 08:39:07 +03:00			`service "$_service_name" stop`
			`nfs_dump_some_threads "rpc.statd"`
			`service "$_service_name" start`
Eventscripts: Modernise 60.ganesha to match 60.nfs Originally from Srikrishan Malik <srikrishan.malik@in.ibm.com> with some style changes by me. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 637cab6304dae66b85668506028c76ea1ee88980) 2012-05-16 11:24:21 +04:00			`;;`
			`esac`
			`}`

			`########################################################`
create a startstop_nfs function that can start/stop the nfs service of different platforms (This used to be ctdb commit f6cc6bd1f62138fbf812d1917f7341e2fa2323da) 2008-02-11 01:35:37 +03:00			`# start/stop the nfs service on different platforms`
			`########################################################`
			`startstop_nfs() {`
			`PLATFORM="unknown"`
Eventscript functions: add $CTDB_ETCDIR and hook service() functions. * $CTDB_ETCDIR defaults to /etc but can be changed for testing. All hard-coded instances of /etc have been changed to $CTDB_ETCDIR. This includes references to /etc/init.d and /etc/sysconfig. * service() and nice_service() functions now call new function _service(). This makes it easier to override these functions (say, in rc.local) for testing and call most of the existing functionality using _service(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f43c9a7604b779bb6257ddb2bf3cbe266d496a63) 2011-06-07 09:57:29 +04:00			`[ -x $CTDB_ETCDIR/init.d/nfsserver ] && {`
create a startstop_nfs function that can start/stop the nfs service of different platforms (This used to be ctdb commit f6cc6bd1f62138fbf812d1917f7341e2fa2323da) 2008-02-11 01:35:37 +03:00			`PLATFORM="sles"`
			`}`
ctdb-scripts: Support NFS on RHEL7 with systemd Need to be able to recognise a RHEL system. Still use "system" to start and stop service, since that still works and yields the smallest change. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-06-26 04:36:17 +04:00			`[ -x $CTDB_ETCDIR/init.d/nfslock -o \`
			`-r /usr/lib/systemd/system/nfs-lock.service ] && {`
create a startstop_nfs function that can start/stop the nfs service of different platforms (This used to be ctdb commit f6cc6bd1f62138fbf812d1917f7341e2fa2323da) 2008-02-11 01:35:37 +03:00			`PLATFORM="rhel"`
			`}`

			`case $PLATFORM in`
			`sles)`
			`case $1 in`
			`start)`
			`service nfsserver start`
			`;;`
			`stop)`
			`service nfsserver stop > /dev/null 2>&1`
			`;;`
On RHEL, "service nfs stop;service nfs start" and "service nfs restart" sometimes (very rarely) fails to restart the service. Add a function to restart NFSd on SLES and RHEL-like systems. If we detect the system is unhealthy due to kNFSd not running, try to restart the service again "service nfs restart" and hope for the best. CQ1019372 (This used to be ctdb commit 25c4ce7e919f13226219f036bcffd2be76b2f06c) 2010-08-19 01:18:22 +04:00			`restart)`
Eventscripts: use set_proc() in startstop_nfs(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5a3d5c6b1ca3682bb45104e50061871dec6e9b1d) 2011-06-28 08:58:13 +04:00			`set_proc "fs/nfsd/threads" 0`
Eventscripts: work around NFS restart failure under load. "service nfs restart" can fail. To stop nfsd it sends a SIGINT and nfsd might take a while to process it if the system is loaded. Starting nfsd may then fail because resources are still in use. This does some /proc magic to tell nfsd to do no more processing. It then runs service stop, kills nfsd with SIGKILL, and then runs service start. This is much less likely to fail. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a9bf4f82852975b0b627f61ceb2d23401f630805) 2011-01-11 09:06:48 +03:00			`service nfsserver stop > /dev/null 2>&1`
			`pkill -9 nfsd`
eventscripts: New configuration varable $CTDB_NFS_DUMP_STUCK_THREADS If some nfsd threads are still alive after a shutdown during a restart then this indicates the maximum number of threads for which a stack trace should be dumped. This can be useful for trying to determine why nfsd is stuck. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 2503245db10d567af708a04edd3a3b488c24f401) 2013-06-13 05:56:25 +04:00			`nfs_dump_some_threads`
Eventscripts: work around NFS restart failure under load. "service nfs restart" can fail. To stop nfsd it sends a SIGINT and nfsd might take a while to process it if the system is loaded. Starting nfsd may then fail because resources are still in use. This does some /proc magic to tell nfsd to do no more processing. It then runs service stop, kills nfsd with SIGKILL, and then runs service start. This is much less likely to fail. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a9bf4f82852975b0b627f61ceb2d23401f630805) 2011-01-11 09:06:48 +03:00			`service nfsserver start`
On RHEL, "service nfs stop;service nfs start" and "service nfs restart" sometimes (very rarely) fails to restart the service. Add a function to restart NFSd on SLES and RHEL-like systems. If we detect the system is unhealthy due to kNFSd not running, try to restart the service again "service nfs restart" and hope for the best. CQ1019372 (This used to be ctdb commit 25c4ce7e919f13226219f036bcffd2be76b2f06c) 2010-08-19 01:18:22 +04:00			`;;`
create a startstop_nfs function that can start/stop the nfs service of different platforms (This used to be ctdb commit f6cc6bd1f62138fbf812d1917f7341e2fa2323da) 2008-02-11 01:35:37 +03:00			`esac`
			`;;`
			`rhel)`
			`case $1 in`
			`start)`
			`service nfslock start`
			`service nfs start`
			`;;`
			`stop)`
Eventscripts: startstop_nfs stop no longer redirects output to /dev/null. When stopping (as opposed to restarting) it is useful to see this information. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a9ab1937239761dc32b143c9d225447bc6f090b4) 2011-01-14 01:31:05 +03:00			`service nfs stop`
			`service nfslock stop`
create a startstop_nfs function that can start/stop the nfs service of different platforms (This used to be ctdb commit f6cc6bd1f62138fbf812d1917f7341e2fa2323da) 2008-02-11 01:35:37 +03:00			`;;`
On RHEL, "service nfs stop;service nfs start" and "service nfs restart" sometimes (very rarely) fails to restart the service. Add a function to restart NFSd on SLES and RHEL-like systems. If we detect the system is unhealthy due to kNFSd not running, try to restart the service again "service nfs restart" and hope for the best. CQ1019372 (This used to be ctdb commit 25c4ce7e919f13226219f036bcffd2be76b2f06c) 2010-08-19 01:18:22 +04:00			`restart)`
Eventscripts: use set_proc() in startstop_nfs(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5a3d5c6b1ca3682bb45104e50061871dec6e9b1d) 2011-06-28 08:58:13 +04:00			`set_proc "fs/nfsd/threads" 0`
Eventscripts: work around NFS restart failure under load. "service nfs restart" can fail. To stop nfsd it sends a SIGINT and nfsd might take a while to process it if the system is loaded. Starting nfsd may then fail because resources are still in use. This does some /proc magic to tell nfsd to do no more processing. It then runs service stop, kills nfsd with SIGKILL, and then runs service start. This is much less likely to fail. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a9bf4f82852975b0b627f61ceb2d23401f630805) 2011-01-11 09:06:48 +03:00			`service nfs stop > /dev/null 2>&1`
			`service nfslock stop > /dev/null 2>&1`
			`pkill -9 nfsd`
eventscripts: New configuration varable $CTDB_NFS_DUMP_STUCK_THREADS If some nfsd threads are still alive after a shutdown during a restart then this indicates the maximum number of threads for which a stack trace should be dumped. This can be useful for trying to determine why nfsd is stuck. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 2503245db10d567af708a04edd3a3b488c24f401) 2013-06-13 05:56:25 +04:00			`nfs_dump_some_threads`
Eventscripts: work around NFS restart failure under load. "service nfs restart" can fail. To stop nfsd it sends a SIGINT and nfsd might take a while to process it if the system is loaded. Starting nfsd may then fail because resources are still in use. This does some /proc magic to tell nfsd to do no more processing. It then runs service stop, kills nfsd with SIGKILL, and then runs service start. This is much less likely to fail. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a9bf4f82852975b0b627f61ceb2d23401f630805) 2011-01-11 09:06:48 +03:00			`service nfslock start`
			`service nfs start`
On RHEL, "service nfs stop;service nfs start" and "service nfs restart" sometimes (very rarely) fails to restart the service. Add a function to restart NFSd on SLES and RHEL-like systems. If we detect the system is unhealthy due to kNFSd not running, try to restart the service again "service nfs restart" and hope for the best. CQ1019372 (This used to be ctdb commit 25c4ce7e919f13226219f036bcffd2be76b2f06c) 2010-08-19 01:18:22 +04:00			`;;`
create a startstop_nfs function that can start/stop the nfs service of different platforms (This used to be ctdb commit f6cc6bd1f62138fbf812d1917f7341e2fa2323da) 2008-02-11 01:35:37 +03:00			`esac`
			`;;`
			`*)`
			`echo "Unknown platform. NFS is not supported with ctdb"`
			`exit 1`
			`;;`
			`esac`
			`}`
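
The platform detection at the top of `startstop_nfs()` can be exercised without touching the real system. The sketch below is an assumption-laden illustration: `detect_nfs_platform` is a hypothetical helper, and making the systemd unit directory a second argument is done here only so the check can be pointed at a scratch directory.

```shell
# Hypothetical helper, not part of the functions file.
# $1: etc directory (defaults to $CTDB_ETCDIR, then /etc)
# $2: systemd unit directory (an assumption; hard-coded in the original)
detect_nfs_platform ()
{
    _etcdir="${1:-${CTDB_ETCDIR:-/etc}}"
    _unitdir="${2:-/usr/lib/systemd/system}"

    _platform="unknown"
    if [ -x "$_etcdir/init.d/nfsserver" ] ; then
        _platform="sles"
    fi
    # As in the original, this check runs last, so it wins if both match.
    if [ -x "$_etcdir/init.d/nfslock" ] || \
       [ -r "$_unitdir/nfs-lock.service" ] ; then
        _platform="rhel"
    fi

    echo "$_platform"
}

# Try it against a scratch directory tree.
_root=$(mktemp -d)
mkdir -p "$_root/etc/init.d" "$_root/systemd"
detect_nfs_platform "$_root/etc" "$_root/systemd"
touch "$_root/etc/init.d/nfsserver"
chmod +x "$_root/etc/init.d/nfsserver"
detect_nfs_platform "$_root/etc" "$_root/systemd"
rm -rf "$_root"
```

Pointing the first argument (or `$CTDB_ETCDIR`) at a scratch tree is the same trick the `$CTDB_ETCDIR` variable enables for testing the real function.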
add helpers to stop/start nfs lockmanager on different platforms (This used to be ctdb commit 3b797d851bd4bdb8ec2b3981061c668d2cf0f97c) 2008-02-11 01:52:09 +03:00
eventscripts: New configuration varable $CTDB_NFS_DUMP_STUCK_THREADS If some nfsd threads are still alive after a shutdown during a restart then this indicates the maximum number of threads for which a stack trace should be dumped. This can be useful for trying to determine why nfsd is stuck. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 2503245db10d567af708a04edd3a3b488c24f401) 2013-06-13 05:56:25 +04:00			`# Dump up to the configured number of nfsd thread backtraces.`
			`nfs_dump_some_threads ()`
			`{`
ctdb-scripts: Add optional program name argument to nfs_dump_some_threads() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-11-14 05:48:16 +03:00			`_prog="${1:-nfsd}"`

ctdb-scripts: Factor out new function program_stack_traces() In the process, fix a bug where an extra trace would be printed. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-11-14 05:31:03 +03:00			`_num="${CTDB_NFS_DUMP_STUCK_THREADS:-5}"`
			`[ $_num -gt 0 ] \|\| return 0`
eventscripts: New configuration varable $CTDB_NFS_DUMP_STUCK_THREADS If some nfsd threads are still alive after a shutdown during a restart then this indicates the maximum number of threads for which a stack trace should be dumped. This can be useful for trying to determine why nfsd is stuck. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 2503245db10d567af708a04edd3a3b488c24f401) 2013-06-13 05:56:25 +04:00
ctdb-scripts: Add optional program name argument to nfs_dump_some_threads() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-11-14 05:48:16 +03:00			`program_stack_traces "$_prog" $_num`
eventscripts: New configuration varable $CTDB_NFS_DUMP_STUCK_THREADS If some nfsd threads are still alive after a shutdown during a restart then this indicates the maximum number of threads for which a stack trace should be dumped. This can be useful for trying to determine why nfsd is stuck. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 2503245db10d567af708a04edd3a3b488c24f401) 2013-06-13 05:56:25 +04:00			`}`

add helpers to stop/start nfs lockmanager on different platforms (This used to be ctdb commit 3b797d851bd4bdb8ec2b3981061c668d2cf0f97c) 2008-02-11 01:52:09 +03:00			`########################################################`
			`# start/stop the nfs lockmanager service on different platforms`
			`########################################################`
			`startstop_nfslock() {`
			`PLATFORM="unknown"`
Eventscript functions: add $CTDB_ETCDIR and hook service() functions. * $CTDB_ETCDIR defaults to /etc but can be changed for testing. All hard-coded instances of /etc have been changed to $CTDB_ETCDIR. This includes references to /etc/init.d and /etc/sysconfig. * service() and nice_service() functions now call new function _service(). This makes it easier to override these functions (say, in rc.local) for testing and call most of the existing functionality using _service(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f43c9a7604b779bb6257ddb2bf3cbe266d496a63) 2011-06-07 09:57:29 +04:00			`[ -x $CTDB_ETCDIR/init.d/nfsserver ] && {`
add helpers to stop/start nfs lockmanager on different platforms (This used to be ctdb commit 3b797d851bd4bdb8ec2b3981061c668d2cf0f97c) 2008-02-11 01:52:09 +03:00			`PLATFORM="sles"`
			`}`
ctdb-scripts: Support NFS on RHEL7 with systemd Need to be able to recognise a RHEL system. Still use "system" to start and stop service, since that still works and yields the smallest change. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-06-26 04:36:17 +04:00			`[ -x $CTDB_ETCDIR/init.d/nfslock -o \`
			`-r /usr/lib/systemd/system/nfs-lock.service ] && {`
add helpers to stop/start nfs lockmanager on different platforms (This used to be ctdb commit 3b797d851bd4bdb8ec2b3981061c668d2cf0f97c) 2008-02-11 01:52:09 +03:00			`PLATFORM="rhel"`
			`}`

			`case $PLATFORM in`
			`sles)`
			`# On SLES there is no separate lock manager service,`
			`# so just stop/start the NFS server instead`
			`case $1 in`
			`start)`
			`service nfsserver start`
			`;;`
			`stop)`
			`service nfsserver stop > /dev/null 2>&1`
			`;;`
try to restart NFS LOCKD if it failed to start (This used to be ctdb commit 2913cc93a9a172caf9e0d6675cfa4de4cc957b13) 2010-10-14 01:12:41 +04:00			`restart)`
eventscripts: When restarting the nfslock service only show output of start That is, /dev/null the "stop" output. This is consistent with the way CTDB generally deals with the output when stopping a service. It also makes updating the eventscript unit tests easier. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c7332526b1b488abefeb4be78a7cd3f2f9abc451) 2013-07-30 10:21:36 +04:00			`service nfsserver stop > /dev/null 2>&1`
try to restart NFS LOCKD if it failed to start (This used to be ctdb commit 2913cc93a9a172caf9e0d6675cfa4de4cc957b13) 2010-10-14 01:12:41 +04:00			`service nfsserver start`
			`;;`
add helpers to stop/start nfs lockmanager on different platforms (This used to be ctdb commit 3b797d851bd4bdb8ec2b3981061c668d2cf0f97c) 2008-02-11 01:52:09 +03:00			`esac`
			`;;`
			`rhel)`
			`case $1 in`
			`start)`
			`service nfslock start`
			`;;`
			`stop)`
			`service nfslock stop > /dev/null 2>&1`
			`;;`
try to restart NFS LOCKD if it failed to start (This used to be ctdb commit 2913cc93a9a172caf9e0d6675cfa4de4cc957b13) 2010-10-14 01:12:41 +04:00			`restart)`
eventscripts: When restarting the nfslock service only show output of start That is, /dev/null the "stop" output. This is consistent with the way CTDB generally deals with the output when stopping a service. It also makes updating the eventscript unit tests easier. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c7332526b1b488abefeb4be78a7cd3f2f9abc451) 2013-07-30 10:21:36 +04:00			`service nfslock stop > /dev/null 2>&1`
try to restart NFS LOCKD if it failed to start (This used to be ctdb commit 2913cc93a9a172caf9e0d6675cfa4de4cc957b13) 2010-10-14 01:12:41 +04:00			`service nfslock start`
			`;;`
add helpers to stop/start nfs lockmanager on different platforms (This used to be ctdb commit 3b797d851bd4bdb8ec2b3981061c668d2cf0f97c) 2008-02-11 01:52:09 +03:00			`esac`
			`;;`
			`*)`
			`echo "Unknown platform. NFS locking is not supported with ctdb"`
			`exit 1`
			`;;`
			`esac`
			`}`
add possibility to provide site local modifications to the event system through a /etc/ctdb/rc.local script that is sources by /etc/ctdb/functions (This used to be ctdb commit a5b7dd97e3faf0c4f289240307d0e22a67cf2353) 2008-04-10 00:50:12 +04:00
eventscripts: Fix statd-callout update handling 60.nfs and 60.ganesha touch $statd_update_trigger every time they're run. This stops the statd-callout updates from ever being called. Make this logic self-contained and move it to new function nfs_statd_update() in the functions file. Call this in 60.nfs and 60.ganesha with the appropriate update period as the only argument. Signed-off-by: Martin Schwenke <martin@meltin.net> Reported-by: Poornima Gupte <poornima.gupte@in.ibm.com> (This used to be ctdb commit 1b5968f6be084590667f4f15ff3bef13ed9a2973) 2013-05-28 06:01:57 +04:00			`# Periodically update the statd database`
			`nfs_statd_update ()`
			`{`
			`_update_period="$1"`

			`_statd_update_trigger="$service_state_dir/update-trigger"`
			`[ -f "$_statd_update_trigger" ] \|\| touch "$_statd_update_trigger"`

			`_last_update=$(stat --printf="%Y" "$_statd_update_trigger")`
			`_current_time=$(date +"%s")`
			`if [ $(( $_current_time - $_last_update)) -ge $_update_period ] ; then`
			`touch "$_statd_update_trigger"`
			`$CTDB_BASE/statd-callout updatelocal &`
			`$CTDB_BASE/statd-callout updateremote &`
			`fi`
			`}`
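
The trigger-file timing used by `nfs_statd_update()` can be checked on its own. This is a sketch under stated assumptions: `period_elapsed` is a hypothetical helper, the trigger is a temporary file rather than one under `$service_state_dir`, and GNU `stat` is assumed, as in the function above.

```shell
# Hypothetical helper, not part of the functions file.
# Succeeds, and refreshes the trigger file $1, if at least $2 seconds
# have passed since the trigger was last touched.  Assumes GNU stat.
period_elapsed ()
{
    _trigger="$1"
    _period="$2"

    # A missing trigger is created with the current time, so the first
    # update only becomes due after a full period.
    [ -f "$_trigger" ] || touch "$_trigger"

    _last=$(stat --printf="%Y" "$_trigger")
    _now=$(date +"%s")
    if [ $((_now - _last)) -ge "$_period" ] ; then
        touch "$_trigger"
        return 0
    fi
    return 1
}

_t=$(mktemp)
touch -t 200001010000 "$_t"     # pretend the last update was long ago
period_elapsed "$_t" 600 && echo "update due"
period_elapsed "$_t" 600 || echo "update not due yet"
rm -f "$_t"
```

Because only the helper touches the trigger file, and only when an update is due, repeated calls do not keep pushing the next update further away.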

ctdb-eventscripts: Deleting IPs should use the promote_secondaries option If a primary IP address is being deleted from an interface, the secondaries are remembered and added back after the primary is deleted. This is done under a lock shared by the add/del script code. It is necessary because, by default, Linux deletes secondaries when the corresponding primary is deleted. There is a race here between ctdbd and the scripts, since ctdbd doesn't know about the lock. If ctdbd receives a release IP control and the IP address is not on an interface then it is regarded as a "Redundant release of IP" so no "releaseip" event is generated. This can occur if the IP address in question is a secondary that has been temporarily dropped. It is more likely if the number of secondaries is large. Since Linux 2.6.12 (i.e. 2005) Linux has supported a promote_secondaries option on interfaces. This option is currently undocumented but that will change in Linux 3.14. With promote_secondaries enabled the kernel will not drop secondaries but will promote a corresponding secondary instead. The kernel does all necessary locking. Use promote_secondaries to simplify the code, avoid re-adding secondaries, avoid re-adding routes and provide improved performance. This could be done conditionally, with a fallback to legacy secondary-re-adding code, but no supported Linux distribution is running a pre-2.6.12 kernel so this is unnecessary. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-01-28 07:41:25 +04:00			`########################################################`

			`add_ip_to_iface ()`
config: add interface_modify.sh and call it under flock to make modification on interfaces atomic When two releaseip events run in parallel it's possible that the 2nd script readds a secondary ip that was removed by the 1st script. metze (This used to be ctdb commit e02417b2a55c45ac2c125b1b3463c9c39e7bc07a) 2010-01-20 13:10:48 +03:00			`{`
Eventscript functions - no longer require interface_modify.sh Make add_ip_to_iface() and delete_ip_from_iface() do their own locking so the external script is no longer required. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 93f90caf91246074d9359bf31a39b26212cccc42) 2012-03-07 09:18:12 +04:00			`_iface=$1`
			`_ip=$2`
			`_maskbits=$3`
config: add interface_modify.sh and call it under flock to make modification on interfaces atomic When two releaseip events run in parallel it's possible that the 2nd script readds a secondary ip that was removed by the 1st script. metze (This used to be ctdb commit e02417b2a55c45ac2c125b1b3463c9c39e7bc07a) 2010-01-20 13:10:48 +03:00
ctdb-eventscripts: Deleting IPs should use the promote_secondaries option If a primary IP address is being deleted from an interface, the secondaries are remembered and added back after the primary is deleted. This is done under a lock shared by the add/del script code. It is necessary because, by default, Linux deletes secondaries when the corresponding primary is deleted. There is a race here between ctdbd and the scripts, since ctdbd doesn't know about the lock. If ctdbd receives a release IP control and the IP address is not on an interface then it is regarded as a "Redundant release of IP" so no "releaseip" event is generated. This can occur if the IP address in question is a secondary that has been temporarily dropped. It is more likely if the number of secondaries is large. Since Linux 2.6.12 (i.e. 2005) Linux has supported a promote_secondaries option on interfaces. This option is currently undocumented but that will change in Linux 3.14. With promote_secondaries enabled the kernel will not drop secondaries but will promote a corresponding secondary instead. The kernel does all necessary locking. Use promote_secondaries to simplify the code, avoid re-adding secondaries, avoid re-adding routes and provide improved performance. This could be done conditionally, with a fallback to legacy secondary-re-adding code, but no supported Linux distribution is running a pre-2.6.12 kernel so this is unnecessary. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-01-28 07:41:25 +04:00			`# Ensure interface is up`
			`ip link set "$_iface" up || \`
			`die "Failed to bring up interface $_iface"`
config: add interface_modify.sh and call it under flock to make modification on interfaces atomic When two releaseip events run in parallel it's possible that the 2nd script readds a secondary ip that was removed by the 1st script. metze (This used to be ctdb commit e02417b2a55c45ac2c125b1b3463c9c39e7bc07a) 2010-01-20 13:10:48 +03:00
ctdb-eventscripts: Fix regression in IP add/delete functions Commit 176ae6c704528c021fcc34a41878584f43a00119 caused these functions to exit on failure. This is incorrect and broke NAT gateway. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-03-14 06:14:18 +04:00			`ip addr add "$_ip/$_maskbits" brd + dev "$_iface" || {`
			`echo "Failed to add $_ip/$_maskbits on dev $_iface"`
			`return 1`
			`}`
config: add interface_modify.sh and call it under flock to make modification on interfaces atomic When two releaseip events run in parallel it's possible that the 2nd script readds a secondary ip that was removed by the 1st script. metze (This used to be ctdb commit e02417b2a55c45ac2c125b1b3463c9c39e7bc07a) 2010-01-20 13:10:48 +03:00			`}`

			`delete_ip_from_iface()`
			`{`
Eventscript functions - no longer require interface_modify.sh Make add_ip_to_iface() and delete_ip_from_iface() do their own locking so the external script is no longer required. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 93f90caf91246074d9359bf31a39b26212cccc42) 2012-03-07 09:18:12 +04:00			`_iface=$1`
			`_ip=$2`
			`_maskbits=$3`

ctdb-eventscripts: Deleting IPs should use the promote_secondaries option If a primary IP address is being deleted from an interface, the secondaries are remembered and added back after the primary is deleted. This is done under a lock shared by the add/del script code. It is necessary because, by default, Linux deletes secondaries when the corresponding primary is deleted. There is a race here between ctdbd and the scripts, since ctdbd doesn't know about the lock. If ctdbd receives a release IP control and the IP address is not on an interface then it is regarded as a "Redundant release of IP" so no "releaseip" event is generated. This can occur if the IP address in question is a secondary that has been temporarily dropped. It is more likely if the number of secondaries is large. Since Linux 2.6.12 (i.e. 2005) Linux has supported a promote_secondaries option on interfaces. This option is currently undocumented but that will change in Linux 3.14. With promote_secondaries enabled the kernel will not drop secondaries but will promote a corresponding secondary instead. The kernel does all necessary locking. Use promote_secondaries to simplify the code, avoid re-adding secondaries, avoid re-adding routes and provide improved performance. This could be done conditionally, with a fallback to legacy secondary-re-adding code, but no supported Linux distribution is running a pre-2.6.12 kernel so this is unnecessary. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-01-28 07:41:25 +04:00			`# This could be set globally for all interfaces but it is probably`
			`# better to avoid surprises, so limit it to the interfaces where CTDB`
			`# has public IP addresses. There isn't anywhere else convenient`
			`# to do this so just set it each time. This is much cheaper than`
			`# remembering and re-adding secondaries.`
			`set_proc "sys/net/ipv4/conf/${_iface}/promote_secondaries" 1`
Eventscript functions - no longer require interface_modify.sh Make add_ip_to_iface() and delete_ip_from_iface() do their own locking so the external script is no longer required. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 93f90caf91246074d9359bf31a39b26212cccc42) 2012-03-07 09:18:12 +04:00
ctdb-eventscripts: Fix regression in IP add/delete functions Commit 176ae6c704528c021fcc34a41878584f43a00119 caused these functions to exit on failure. This is incorrect and broke NAT gateway. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-03-14 06:14:18 +04:00			`ip addr del "$_ip/$_maskbits" dev "$_iface" || {`
			`echo "Failed to del $_ip on dev $_iface"`
			`return 1`
			`}`
config: add setup_iface_ip_readd_script() helper function This adds a generic infrastructure to register scripts which will be called when the delete_ip_from_iface() funtion needs to readd secondary ips to an interface. metze (This used to be ctdb commit ac97d65f44e8dc8bf2ec8f68e4db3448521755a2) 2010-02-12 11:48:01 +03:00			`}`

scripts: Make drop_all_public_ips() more robust Incorporate some of the logic from ctdb-crash-cleanup.sh that ensures IPs are deleted even if they have the wrong netmask or are on the wrong interface. Factoring out some of the code will allow it to be used elsewhere. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 03356fd5ae7a3ac35fde0289cbea7c71ecf07367) 2013-01-04 04:23:29 +04:00			`# If the given IP is hosted then print 2 items: maskbits and iface`
			`ip_maskbits_iface ()`
			`{`
			`_addr="$1"`

ctdb-scripts: Add IPv6 addresses support in ip_maskbits_iface() It also prints a third word, the address family. This is either "inet" or "inet6". Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-11-21 06:37:54 +03:00			`case "$_addr" in`
			`*:*) _family="inet6" ; _bits=128 ;;`
			`*) _family="inet" ; _bits=32 ;;`
			`esac`

			`ip addr show to "${_addr}/${_bits}" 2>/dev/null | \`
			`awk -v family="${_family}" \`
			`'NR == 1 { iface = gensub(":$", "", 1, $2) } \`
			`$1 ~ /inet/ { print gensub(".*/", "", 1, $2), iface, family }'`
scripts: Make drop_all_public_ips() more robust Incorporate some of the logic from ctdb-crash-cleanup.sh that ensures IPs are deleted even if they have the wrong netmask or are on the wrong interface. Factoring out some of the code will allow it to be used elsewhere. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 03356fd5ae7a3ac35fde0289cbea7c71ecf07367) 2013-01-04 04:23:29 +04:00			`}`
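
The parsing step in ip_maskbits_iface() can be tried without a live interface. What follows is a minimal sketch, not the CTDB code: the names `addr_family` and `parse_ip_addr_show` and the canned `ip addr show` output are assumptions made for illustration, and the POSIX `sub()` function stands in for the gawk-only `gensub()` used above.

```shell
#!/bin/sh
# Hypothetical sketch: addr_family and parse_ip_addr_show are
# illustrative names, not functions from the CTDB functions file.

# An address containing ":" is treated as IPv6, anything else as IPv4.
addr_family () {
	case "$1" in
	*:*) echo "inet6" ;;
	*)   echo "inet" ;;
	esac
}

# Read "ip addr show" style output on stdin and print:
#   maskbits iface family
parse_ip_addr_show () {
	awk -v family="$1" '
	NR == 1 { iface = $2; sub(/:$/, "", iface) }
	$1 ~ /inet/ { bits = $2; sub(/.*\//, "", bits); print bits, iface, family }'
}

# Canned sample input (an assumption about typical output).
sample='2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP
    inet 10.0.0.5/24 brd 10.0.0.255 scope global eth0'

printf '%s\n' "$sample" | parse_ip_addr_show "$(addr_family 10.0.0.5)"
```

With the sample input this prints `24 eth0 inet`, the same three words that drop_ip() consumes via `set --`.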

			`drop_ip ()`
			`{`
			`_addr="${1%/*}" # Remove optional maskbits`

			`set -- $(ip_maskbits_iface $_addr)`
			`if [ -n "$1" ] ; then`
			`_maskbits="$1"`
			`_iface="$2"`
scripts: drop_all_public_ips() now prints messages to stdout, not log Change all callers to maintain current behaviour. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0b67397ef5419c781a35916575151da7b7e7cc27) 2013-06-16 14:24:10 +04:00			`echo "Removing public address $_addr/$_maskbits from device $_iface"`
scripts: drop_ip() should use delete_ip_from_iface() Otherwise secondary addresses that aren't owned by CTDB could be dropped. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5ffce65a1ad659b198ddf647622b899bdde45c72) 2013-06-18 08:53:17 +04:00			`delete_ip_from_iface $_iface $_addr $_maskbits >/dev/null 2>&1`
scripts: Make drop_all_public_ips() more robust Incorporate some of the logic from ctdb-crash-cleanup.sh that ensures IPs are deleted even if they have the wrong netmask or are on the wrong interface. Factoring out some of the code will allow it to be used elsewhere. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 03356fd5ae7a3ac35fde0289cbea7c71ecf07367) 2013-01-04 04:23:29 +04:00			`fi`
			`}`
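
drop_ip() accepts an address with or without maskbits because `${1%/*}` strips a trailing `/bits` suffix when one is present. A small sketch of that expansion, using a hypothetical helper name `strip_maskbits` that does not exist in the functions file:

```shell
#!/bin/sh
# strip_maskbits is a hypothetical helper used only for this sketch.
strip_maskbits () {
	# ${1%/*} removes the shortest trailing match of "/*", if any.
	echo "${1%/*}"
}

strip_maskbits 10.0.0.5/24
strip_maskbits 10.0.0.5
strip_maskbits fd00::5/64
```

The three calls print `10.0.0.5`, `10.0.0.5` and `fd00::5`; an argument without a slash passes through unchanged.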

			`drop_all_public_ips ()`
			`{`
			`while read _ip _x ; do`
scripts: drop_all_public_ips() now prints messages to stdout, not log Change all callers to maintain current behaviour. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0b67397ef5419c781a35916575151da7b7e7cc27) 2013-06-16 14:24:10 +04:00			`drop_ip "$_ip"`
scripts: Move drop_all_public_ips() to the functions file ... so it can be improved and used elsewhere. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b23c30253cc9eb274b895cac0f8c65245ba0a200) 2013-01-03 08:07:07 +04:00			`done <"${CTDB_PUBLIC_ADDRESSES:-/dev/null}"`
			`}`

40.vsftpd monitor event only fails after 2 failures to connect to port 21. Change the monitor event in 40.vsftpd so it only fails if there are 2 successive failures connecting to port 21. This reduces the likelihood of unhealthy nodes due to vsftpd being restarted for reconfiguration due to node failover or system reconfiguration. New eventscript functions ctdb_counter_init, ctdb_counter_incr, ctdb_counter_limit. These are used to count arbitrary things in eventscripts, depending on the eventscript name and a tag that is passed, and determine if a specified limit has been hit. They're good for counting failures! These functions are used in 40.vsftpd and also in 01.reclock - the latter used to do the counting without these functions. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit cfe63636a163730ae9ad3554b78519b3c07d8896) 2009-09-30 15:05:16 +04:00			`########################################################`
eventscripts: Add modulo (%) operator to ctdb_check_counter() Also add it to the corresponding eventscript unit test infrastructure. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f4ef83a256f59eeb00b9a5bc10c28347e1ad1031) 2013-08-02 09:18:47 +04:00			`# Simple counters`
40.vsftpd monitor event only fails after 2 failures to connect to port 21. Change the monitor event in 40.vsftpd so it only fails if there are 2 successive failures connecting to port 21. This reduces the likelihood of unhealthy nodes due to vsftpd being restarted for reconfiguration due to node failover or system reconfiguration. New eventscript functions ctdb_counter_init, ctdb_counter_incr, ctdb_counter_limit. These are used to count arbitrary things in eventscripts, depending on the eventscript name and a tag that is passed, and determine if a specified limit has been hit. They're good for counting failures! These functions are used in 40.vsftpd and also in 01.reclock - the latter used to do the counting without these functions. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit cfe63636a163730ae9ad3554b78519b3c07d8896) 2009-09-30 15:05:16 +04:00			`_ctdb_counter_common () {`
eventscripts: counters default to $script_name if $service_name not set Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit fff88940f71058e4eefd65f50a6701389c005c17) 2013-04-30 09:31:27 +04:00			`_service_name="${1:-${service_name:-${script_name}}}"`
Eventscript functions: add optional event name argument to fail count functions. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b14f18649f42aab80ce0336c15ab6159f241c9af) 2010-12-15 02:48:00 +03:00			`_counter_file="$ctdb_fail_dir/$_service_name"`
40.vsftpd monitor event only fails after 2 failures to connect to port 21. Change the monitor event in 40.vsftpd so it only fails if there are 2 successive failures connecting to port 21. This reduces the likelihood of unhealthy nodes due to vsftpd being restarted for reconfiguration due to node failover or system reconfiguration. New eventscript functions ctdb_counter_init, ctdb_counter_incr, ctdb_counter_limit. These are used to count arbitrary things in eventscripts, depending on the eventscript name and a tag that is passed, and determine if a specified limit has been hit. They're good for counting failures! These functions are used in 40.vsftpd and also in 01.reclock - the latter used to do the counting without these functions. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit cfe63636a163730ae9ad3554b78519b3c07d8896) 2009-09-30 15:05:16 +04:00			`mkdir -p "${_counter_file%/*}" # dirname`
			`}`
			`ctdb_counter_init () {`
Eventscript functions: fix counter regression. d362be7d32079ac1390d67056ce107bfbca2c937 wasn't well thought out. Subsequent commits depend on ctdb_counter_init() taking an argument, so this makes those cases work. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 05a8fcfbac3da2b5843b31e0fe258255cc761190) 2010-12-16 02:11:33 +03:00			`_ctdb_counter_common "$1"`
40.vsftpd monitor event only fails after 2 failures to connect to port 21. Change the monitor event in 40.vsftpd so it only fails if there are 2 successive failures connecting to port 21. This reduces the likelihood of unhealthy nodes due to vsftpd being restarted for reconfiguration due to node failover or system reconfiguration. New eventscript functions ctdb_counter_init, ctdb_counter_incr, ctdb_counter_limit. These are used to count arbitrary things in eventscripts, depending on the eventscript name and a tag that is passed, and determine if a specified limit has been hit. They're good for counting failures! These functions are used in 40.vsftpd and also in 01.reclock - the latter used to do the counting without these functions. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit cfe63636a163730ae9ad3554b78519b3c07d8896) 2009-09-30 15:05:16 +04:00
Now vaguely tested initscript updates. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f1e350f9edb74cc44b6c5be4c062fd93e98ba8c4) 2009-11-19 08:48:19 +03:00			`>"$_counter_file"`
40.vsftpd monitor event only fails after 2 failures to connect to port 21. Change the monitor event in 40.vsftpd so it only fails if there are 2 successive failures connecting to port 21. This reduces the likelihood of unhealthy nodes due to vsftpd being restarted for reconfiguration due to node failover or system reconfiguration. New eventscript functions ctdb_counter_init, ctdb_counter_incr, ctdb_counter_limit. These are used to count arbitrary things in eventscripts, depending on the eventscript name and a tag that is passed, and determine if a specified limit has been hit. They're good for counting failures! These functions are used in 40.vsftpd and also in 01.reclock - the latter used to do the counting without these functions. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit cfe63636a163730ae9ad3554b78519b3c07d8896) 2009-09-30 15:05:16 +04:00			`}`
			`ctdb_counter_incr () {`
Eventscript functions: fix counter regression. d362be7d32079ac1390d67056ce107bfbca2c937 wasn't well thought out. Subsequent commits depend on ctdb_counter_init() taking an argument, so this makes those cases work. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 05a8fcfbac3da2b5843b31e0fe258255cc761190) 2010-12-16 02:11:33 +03:00			`_ctdb_counter_common "$1"`
40.vsftpd monitor event only fails after 2 failures to connect to port 21. Change the monitor event in 40.vsftpd so it only fails if there are 2 successive failures connecting to port 21. This reduces the likelihood of unhealthy nodes due to vsftpd being restarted for reconfiguration due to node failover or system reconfiguration. New eventscript functions ctdb_counter_init, ctdb_counter_incr, ctdb_counter_limit. These are used to count arbitrary things in eventscripts, depending on the eventscript name and a tag that is passed, and determine if a specified limit has been hit. They're good for counting failures! These functions are used in 40.vsftpd and also in 01.reclock - the latter used to do the counting without these functions. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit cfe63636a163730ae9ad3554b78519b3c07d8896) 2009-09-30 15:05:16 +04:00
			`# unary counting!`
			`echo -n 1 >> "$_counter_file"`
			`}`
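
The "unary counting" used by these functions stores one byte per event, so the counter value is simply the file size. A minimal standalone sketch of the idea follows; the names `counter_dir`, `counter_init`, `counter_incr` and `counter_value` are assumptions for illustration, and `wc -c` is used here in place of `stat -c` purely for portability.

```shell
#!/bin/sh
# Hypothetical standalone sketch of unary counting; none of these
# names come from the CTDB functions file.
counter_dir=$(mktemp -d)

# Reset the counter by truncating its file.
counter_init () { : >"$counter_dir/$1"; }

# Increment the counter by appending a single byte.
counter_incr () { printf 1 >>"$counter_dir/$1"; }

# The count is the file size in bytes; a missing file counts as 0.
counter_value () {
	if [ -f "$counter_dir/$1" ] ; then
		wc -c <"$counter_dir/$1" | tr -d ' '
	else
		echo 0
	fi
}

counter_init demo
counter_incr demo
counter_incr demo
counter_incr demo
echo "failures: $(counter_value demo)"
```

Appending a byte avoids a read-modify-write cycle on the counter file, which is probably why the original favours it over storing a decimal number.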
Eventscript functions: new function ctdb_check_counter(). This should eventually be able to replace ctdb_check_counter_limit() and ctdb_check_counter_equal(), although it doesn't issue warnings like the former. It takes 4 optional arguments: 1. _msg - If "error" then over limit causes an error message and and exit 1. Anything else fails silently but the function returns 1. Default is "error". 2. _op - An integer operator supported by test (e.g. -eq, -ge, -gt). Default is -ge. 3. _limit - Limit for the counter to be used in comparison. Default is $service_fail_limit. 4. _service_name - Used to identify the counter. Default is $service_name. For example: ctdb_check_counter error -ge 5 foo will print a message and exit 1 if the counter for foo is >= 5, whereas ctdb_check_counter check -ge 5 foo will just return 1 if the counter for foo is >= 5, and ctdb_counter_check with print a message and exit 1 if the counter for $service_name is >= $service_fail_limit. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5b01b7233515669e995e037205796e265643b176) 2010-12-17 08:10:56 +03:00			`ctdb_check_counter () {`
			`_msg="${1:-error}" # "error" - anything else is silent on fail`
			`_op="${2:--ge}" # an integer operator supported by test`
			`_limit="${3:-${service_fail_limit}}"`
			`shift 3`
			`_ctdb_counter_common "$1"`

			`# unary counting!`
			`_size=$(stat -c "%s" "$_counter_file" 2>/dev/null || echo 0)`
eventscripts: Add modulo (%) operator to ctdb_check_counter() Also add it to the corresponding eventscript unit test infrastructure. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f4ef83a256f59eeb00b9a5bc10c28347e1ad1031) 2013-08-02 09:18:47 +04:00			`_hit=false`
			`if [ "$_op" != "%" ] ; then`
			`if [ $_size $_op $_limit ] ; then`
			`_hit=true`
			`fi`
			`else`
			`if [ $(($_size $_op $_limit)) -eq 0 ] ; then`
			`_hit=true`
			`fi`
			`fi`
			`if $_hit ; then`
Eventscript functions: new function ctdb_check_counter(). This should eventually be able to replace ctdb_check_counter_limit() and ctdb_check_counter_equal(), although it doesn't issue warnings like the former. It takes 4 optional arguments: 1. _msg - If "error" then over limit causes an error message and and exit 1. Anything else fails silently but the function returns 1. Default is "error". 2. _op - An integer operator supported by test (e.g. -eq, -ge, -gt). Default is -ge. 3. _limit - Limit for the counter to be used in comparison. Default is $service_fail_limit. 4. _service_name - Used to identify the counter. Default is $service_name. For example: ctdb_check_counter error -ge 5 foo will print a message and exit 1 if the counter for foo is >= 5, whereas ctdb_check_counter check -ge 5 foo will just return 1 if the counter for foo is >= 5, and ctdb_counter_check with print a message and exit 1 if the counter for $service_name is >= $service_fail_limit. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5b01b7233515669e995e037205796e265643b176) 2010-12-17 08:10:56 +03:00			`if [ "$_msg" = "error" ] ; then`
eventscripts: Improve message logged when a counter hits a limit It should print the actual number of consecutive failures rather than the limit. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit ff5f0d1e29af2b293e30cdc54bed03a644be7038) 2013-08-08 10:02:44 +04:00			`echo "ERROR: $_size consecutive failures for $_service_name, marking node unhealthy"`
Eventscript functions: new function ctdb_check_counter(). This should eventually be able to replace ctdb_check_counter_limit() and ctdb_check_counter_equal(), although it doesn't issue warnings like the former. It takes 4 optional arguments: 1. _msg - If "error" then over limit causes an error message and and exit 1. Anything else fails silently but the function returns 1. Default is "error". 2. _op - An integer operator supported by test (e.g. -eq, -ge, -gt). Default is -ge. 3. _limit - Limit for the counter to be used in comparison. Default is $service_fail_limit. 4. _service_name - Used to identify the counter. Default is $service_name. For example: ctdb_check_counter error -ge 5 foo will print a message and exit 1 if the counter for foo is >= 5, whereas ctdb_check_counter check -ge 5 foo will just return 1 if the counter for foo is >= 5, and ctdb_counter_check with print a message and exit 1 if the counter for $service_name is >= $service_fail_limit. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5b01b7233515669e995e037205796e265643b176) 2010-12-17 08:10:56 +03:00			`exit 1`
			`else`
			`return 1`
			`fi`
			`fi`
			`}`
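
The comparison at the heart of ctdb_check_counter() can be sketched on its own: ordinary `test` operators are applied directly, while `%` is treated as "hit when the count is a multiple of the limit". The function name `counter_hit` is hypothetical, and the sketch takes the count as an argument instead of reading a counter file.

```shell
#!/bin/sh
# counter_hit is a hypothetical name; it mirrors only the comparison
# logic, not the messaging or exit behaviour of ctdb_check_counter().
counter_hit () {
	_size="$1" ; _op="$2" ; _limit="$3"
	if [ "$_op" = "%" ] ; then
		# Modulo: hit whenever the count is a multiple of the limit.
		[ $(($_size % $_limit)) -eq 0 ]
	else
		# Any integer operator understood by test, e.g. -eq, -ge, -gt.
		[ "$_size" "$_op" "$_limit" ]
	fi
}

for n in 1 2 3 4 5 6 ; do
	if counter_hit "$n" % 3 ; then
		echo "count $n: hit"
	fi
done
```

Note that a count of 0 also satisfies the modulo test, so a caller that wants "every Nth failure" may need to check for a non-zero count separately.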
add a new support function ctdb_check_counter_equal() update nfs to try to restart the service after 10 consecutive failures and to flag the node unhealthy after 15 add similar function to mountd (This used to be ctdb commit 1569a54bb82fc433895ed68f816cf48399ad9d40) 2010-11-17 05:50:56 +03:00
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`########################################################`

ctdb/eventscripts: Move all eventscript state under $CTDB_VARDIR/state Services can be flagged for reconfigure when they release IPs at shutdown. The flag is never removed and the service is prematurely reconfigured during the first "ipreallocated" event, before any IPs are hosted and before the "startup" event has actually started the services. $CTDB_VARDIR/state directly contained the service state subdirectories and is already removed in the "init" event. Just push the service state subdirectories down a level and put everything else in a subdirectory. This way all the eventscript state gets cleaned up every time CTDB starts up. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri Jan 17 09:58:26 CET 2014 on sn-devel-104 2013-12-18 10:08:55 +04:00			`ctdb_status_dir="$CTDB_VARDIR/state/service_status"`
			`ctdb_fail_dir="$CTDB_VARDIR/state/failcount"`
Eventscript functions: new functions to remember/check if service managed. This was done ad hoc and was badly named. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a084a121f629b2c1bcefc1e4c4a4a5cacf53987) 2010-12-15 02:45:17 +03:00
Eventscript functions: new function ctdb_setup_service_state_dir(). To be used by eventscripts to create a per-service directory for their own state data. $service_state_dir is set to point to the new directory. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a273554791c2a5281aee28f8e2be0c514e14c91e) 2010-12-15 02:49:48 +03:00			`ctdb_setup_service_state_dir ()`
			`{`
ctdb/eventscripts: Move all eventscript state under $CTDB_VARDIR/state Services can be flagged for reconfigure when they release IPs at shutdown. The flag is never removed and the service is prematurely reconfigured during the first "ipreallocated" event, before any IPs are hosted and before the "startup" event has actually started the services. $CTDB_VARDIR/state directly contained the service state subdirectories and is already removed in the "init" event. Just push the service state subdirectories down a level and put everything else in a subdirectory. This way all the eventscript state gets cleaned up every time CTDB starts up. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri Jan 17 09:58:26 CET 2014 on sn-devel-104 2013-12-18 10:08:55 +04:00			`service_state_dir="$CTDB_VARDIR/state/service_state/${1:-${service_name}}"`
Eventscript functions: new function ctdb_setup_service_state_dir(). To be used by eventscripts to create a per-service directory for their own state data. $service_state_dir is set to point to the new directory. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a273554791c2a5281aee28f8e2be0c514e14c91e) 2010-12-15 02:49:48 +03:00			`mkdir -p "$service_state_dir" || {`
			`echo "Error creating state dir \"$service_state_dir\""`
			`exit 1`
			`}`
			`}`

Eventscript functions: new functions to remember/check if service managed. This was done ad hoc and was badly named. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a084a121f629b2c1bcefc1e4c4a4a5cacf53987) 2010-12-15 02:45:17 +03:00			`########################################################`
			`# Managed status history, for auto-start/stop`

ctdb/eventscripts: Move all eventscript state under $CTDB_VARDIR/state Services can be flagged for reconfigure when they release IPs at shutdown. The flag is never removed and the service is prematurely reconfigured during the first "ipreallocated" event, before any IPs are hosted and before the "startup" event has actually started the services. $CTDB_VARDIR/state directly contained the service state subdirectories and is already removed in the "init" event. Just push the service state subdirectories down a level and put everything else in a subdirectory. This way all the eventscript state gets cleaned up every time CTDB starts up. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri Jan 17 09:58:26 CET 2014 on sn-devel-104 2013-12-18 10:08:55 +04:00			`ctdb_managed_dir="$CTDB_VARDIR/state/managed_history"`
Eventscript functions: new functions to remember/check if service managed. This was done ad hoc and was badly named. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a084a121f629b2c1bcefc1e4c4a4a5cacf53987) 2010-12-15 02:45:17 +03:00
			`_ctdb_managed_common ()`
			`{`
eventscripts: Simplify handling of $service name in "managed" functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. $service_name is no automatically longer set in the functions file. This means it needs to be explicitly set in 13.per_ip_routing because this script uses ctdb_service_check_reconfigure(). Eventscript unit test infrastructure needs to set $service_name during fake service setup, and policy routing tests need to be updated accordingly. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 27aab8783898a50da8c4bc887b512d8f0c0d842c) 2013-04-29 21:32:29 +04:00			`_ctdb_managed_file="$ctdb_managed_dir/$service_name"`
Eventscript functions: new functions to remember/check if service managed. This was done ad hoc and was badly named. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a084a121f629b2c1bcefc1e4c4a4a5cacf53987) 2010-12-15 02:45:17 +03:00			`}`

			`ctdb_service_managed ()`
			`{`
eventscripts: Simplify handling of $service name in "managed" functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. $service_name is no automatically longer set in the functions file. This means it needs to be explicitly set in 13.per_ip_routing because this script uses ctdb_service_check_reconfigure(). Eventscript unit test infrastructure needs to set $service_name during fake service setup, and policy routing tests need to be updated accordingly. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 27aab8783898a50da8c4bc887b512d8f0c0d842c) 2013-04-29 21:32:29 +04:00			`_ctdb_managed_common`
Eventscript functions: new functions to remember/check if service managed. This was done ad hoc and was badly named. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a084a121f629b2c1bcefc1e4c4a4a5cacf53987) 2010-12-15 02:45:17 +03:00			`mkdir -p "$ctdb_managed_dir"`
			`touch "$_ctdb_managed_file"`
			`}`

			`ctdb_service_unmanaged ()`
			`{`
eventscripts: Simplify handling of $service name in "managed" functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. $service_name is no automatically longer set in the functions file. This means it needs to be explicitly set in 13.per_ip_routing because this script uses ctdb_service_check_reconfigure(). Eventscript unit test infrastructure needs to set $service_name during fake service setup, and policy routing tests need to be updated accordingly. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 27aab8783898a50da8c4bc887b512d8f0c0d842c) 2013-04-29 21:32:29 +04:00			`_ctdb_managed_common`
Eventscript functions: new functions to remember/check if service managed. This was done ad hoc and was badly named. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a084a121f629b2c1bcefc1e4c4a4a5cacf53987) 2010-12-15 02:45:17 +03:00			`rm -f "$_ctdb_managed_file"`
			`}`

			`is_ctdb_previously_managed_service ()`
			`{`
eventscripts: Simplify handling of $service name in "managed" functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. $service_name is no automatically longer set in the functions file. This means it needs to be explicitly set in 13.per_ip_routing because this script uses ctdb_service_check_reconfigure(). Eventscript unit test infrastructure needs to set $service_name during fake service setup, and policy routing tests need to be updated accordingly. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 27aab8783898a50da8c4bc887b512d8f0c0d842c) 2013-04-29 21:32:29 +04:00			`_ctdb_managed_common`
Eventscript functions: new functions to remember/check if service managed. This was done ad hoc and was badly named. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a084a121f629b2c1bcefc1e4c4a4a5cacf53987) 2010-12-15 02:45:17 +03:00			`[ -f "$_ctdb_managed_file" ]`
			`}`
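
The "managed" history above is a marker-file pattern: the existence of a file named after the service records that CTDB has managed it. A minimal sketch under assumed names (`managed_dir`, `mark_managed`, `mark_unmanaged` and `was_managed` are illustrative, and a temporary directory stands in for `$CTDB_VARDIR/state/managed_history`):

```shell
#!/bin/sh
# Hypothetical sketch of the marker-file pattern; these names are not
# part of the CTDB functions file.
managed_dir=$(mktemp -d)

mark_managed ()   { mkdir -p "$managed_dir" && touch "$managed_dir/$1"; }
mark_unmanaged () { rm -f "$managed_dir/$1"; }
was_managed ()    { [ -f "$managed_dir/$1" ]; }

mark_managed demo
if was_managed demo ; then
	echo "demo was previously managed"
fi

mark_unmanaged demo
if ! was_managed demo ; then
	echo "demo is no longer recorded as managed"
fi
```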

			`########################################################`
			`# Check and set status`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00
Event scripts: Respect CTDB_MANAGES_NFS and add function log_status_cat. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5d97c07be13a8209a81dfc8f73e49371949e4dc3) 2009-11-25 08:34:49 +03:00			`log_status_cat ()`
			`{`
ctdb_setstatus in /etc/ctdb/functions was not working correctly because it was called with a wrong parameter list (This used to be ctdb commit e1e285d9f7fa3237dbbacca52a4eb2b264fa5986) 2010-03-10 12:39:31 +03:00			`echo "node is \"$1\", \"${script_name}\" reports problem: $(cat "$2")"`
Event scripts: Respect CTDB_MANAGES_NFS and add function log_status_cat. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5d97c07be13a8209a81dfc8f73e49371949e4dc3) 2009-11-25 08:34:49 +03:00			`}`

Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`ctdb_checkstatus ()`
			`{`
Event scripts: use $script_name rather than $service name for status. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 517e9d9b188b18dffc712a8fecddb41540d27b8d) 2009-11-25 08:42:14 +03:00			`if [ -r "$ctdb_status_dir/$script_name/unhealthy" ] ; then`
			`log_status_cat "unhealthy" "$ctdb_status_dir/$script_name/unhealthy"`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`return 1`
Event scripts: use $script_name rather than $service name for status. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 517e9d9b188b18dffc712a8fecddb41540d27b8d) 2009-11-25 08:42:14 +03:00			`elif [ -r "$ctdb_status_dir/$script_name/banned" ] ; then`
			`log_status_cat "banned" "$ctdb_status_dir/$script_name/banned"`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`return 2`
			`else`
			`return 0`
			`fi`
			`}`

			`ctdb_setstatus ()`
			`{`
Event scripts: use $script_name rather than $service name for status. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 517e9d9b188b18dffc712a8fecddb41540d27b8d) 2009-11-25 08:42:14 +03:00			`d="$ctdb_status_dir/$script_name"`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`case "$1" in`
			`unhealthy|banned)`
			`mkdir -p "$d"`
			`cat "$2" >"$d/$1"`
			`;;`
			`*)`
			`for i in "banned" "unhealthy" ; do`
			`rm -f "$d/$i"`
			`done`
			`;;`
			`esac`
			`}`
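
The status helpers above amount to a small file-based protocol: a readable `unhealthy` or `banned` file under `$ctdb_status_dir/$script_name` marks a problem, and any other status argument clears both files. The following is a minimal sketch of that round trip, not the upstream code. The `demo_` function names, the temporary directory and the script name `40.example` are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: in a real eventscript the framework sets these two
# variables; the values here are hypothetical stand-ins.
ctdb_status_dir=$(mktemp -d)
script_name="40.example"

# Mirrors the shape of ctdb_setstatus: record a reason file for
# "unhealthy" or "banned", otherwise clear both markers.
demo_setstatus ()
{
    d="$ctdb_status_dir/$script_name"
    case "$1" in
    unhealthy|banned)
        mkdir -p "$d"
        cat "$2" >"$d/$1"
        ;;
    *)
        rm -f "$d/banned" "$d/unhealthy"
        ;;
    esac
}

# Mirrors the shape of ctdb_checkstatus: 1 = unhealthy, 2 = banned,
# 0 = no marker present.
demo_checkstatus ()
{
    if [ -r "$ctdb_status_dir/$script_name/unhealthy" ] ; then
        return 1
    elif [ -r "$ctdb_status_dir/$script_name/banned" ] ; then
        return 2
    else
        return 0
    fi
}

reason=$(mktemp)
echo "port 21 not responding" >"$reason"

rc_initial=0 ; demo_checkstatus || rc_initial=$?
demo_setstatus unhealthy "$reason"
rc_unhealthy=0 ; demo_checkstatus || rc_unhealthy=$?
demo_setstatus ok
rc_cleared=0 ; demo_checkstatus || rc_cleared=$?

echo "initial=$rc_initial unhealthy=$rc_unhealthy cleared=$rc_cleared"
rm -rf "$ctdb_status_dir" "$reason"
```

Because `unhealthy` is tested first, a script that has both markers set reports status 1 rather than 2.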

Eventscripts: rejig the reconfigure infrastructure. * Add an optional service name argument to existing reconfigure functions. * User function service_reconfigure() instead of variable $service_reconfigure to specify how a service is reconfigured. * New function ctdb_service_check_reconfigure() reconfigures a service if it is flagged for reconfigure. * Remove $service_reconfigure settings from 40.vsftpd and 41.httpd - they're the defaults. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 15d4111d0761d82f57d5d4f0b1227812d14e4d7c) 2010-12-15 11:19:21 +03:00			`##################################################################`
			`# Reconfigure a service on demand`

			`_ctdb_service_reconfigure_common ()`
			`{`
eventscripts: Simplify handling of $service name in reconfigure functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c2ea72ff565222f9edab408638bd45dbba6e8ff7) 2013-04-29 20:59:41 +04:00			`_d="$ctdb_status_dir/${service_name}"`
Eventscripts: rejig the reconfigure infrastructure. * Add an optional service name argument to existing reconfigure functions. * User function service_reconfigure() instead of variable $service_reconfigure to specify how a service is reconfigured. * New function ctdb_service_check_reconfigure() reconfigures a service if it is flagged for reconfigure. * Remove $service_reconfigure settings from 40.vsftpd and 41.httpd - they're the defaults. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 15d4111d0761d82f57d5d4f0b1227812d14e4d7c) 2010-12-15 11:19:21 +03:00			`mkdir -p "$_d"`
			`_ctdb_service_reconfigure_flag="$_d/reconfigure"`
			`}`

Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`ctdb_service_needs_reconfigure ()`
			`{`
eventscripts: Simplify handling of $service name in reconfigure functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c2ea72ff565222f9edab408638bd45dbba6e8ff7) 2013-04-29 20:59:41 +04:00			`_ctdb_service_reconfigure_common`
Eventscripts: rejig the reconfigure infrastructure. * Add an optional service name argument to existing reconfigure functions. * User function service_reconfigure() instead of variable $service_reconfigure to specify how a service is reconfigured. * New function ctdb_service_check_reconfigure() reconfigures a service if it is flagged for reconfigure. * Remove $service_reconfigure settings from 40.vsftpd and 41.httpd - they're the defaults. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 15d4111d0761d82f57d5d4f0b1227812d14e4d7c) 2010-12-15 11:19:21 +03:00			`[ -e "$_ctdb_service_reconfigure_flag" ]`
40.vsftpd monitor event only fails after 2 failures to connect to port 21. Change the monitor event in 40.vsftpd so it only fails if there are 2 successive failures connecting to port 21. This reduces the likelihood of unhealthy nodes due to vsftpd being restarted for reconfiguration due to node failover or system reconfiguration. New eventscript functions ctdb_counter_init, ctdb_counter_incr, ctdb_counter_limit. These are used to count arbitrary things in eventscripts, depending on the eventscript name and a tag that is passed, and determine if a specified limit has been hit. They're good for counting failures! These functions are used in 40.vsftpd and also in 01.reclock - the latter used to do the counting without these functions. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit cfe63636a163730ae9ad3554b78519b3c07d8896) 2009-09-30 15:05:16 +04:00			`}`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00
			`ctdb_service_set_reconfigure ()`
			`{`
eventscripts: Simplify handling of $service name in reconfigure functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c2ea72ff565222f9edab408638bd45dbba6e8ff7) 2013-04-29 20:59:41 +04:00			`_ctdb_service_reconfigure_common`
Eventscripts: rejig the reconfigure infrastructure. * Add an optional service name argument to existing reconfigure functions. * User function service_reconfigure() instead of variable $service_reconfigure to specify how a service is reconfigured. * New function ctdb_service_check_reconfigure() reconfigures a service if it is flagged for reconfigure. * Remove $service_reconfigure settings from 40.vsftpd and 41.httpd - they're the defaults. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 15d4111d0761d82f57d5d4f0b1227812d14e4d7c) 2010-12-15 11:19:21 +03:00			`>"$_ctdb_service_reconfigure_flag"`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`}`

			`ctdb_service_unset_reconfigure ()`
			`{`
eventscripts: Simplify handling of $service name in reconfigure functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c2ea72ff565222f9edab408638bd45dbba6e8ff7) 2013-04-29 20:59:41 +04:00			`_ctdb_service_reconfigure_common`
Eventscripts: rejig the reconfigure infrastructure. * Add an optional service name argument to existing reconfigure functions. * User function service_reconfigure() instead of variable $service_reconfigure to specify how a service is reconfigured. * New function ctdb_service_check_reconfigure() reconfigures a service if it is flagged for reconfigure. * Remove $service_reconfigure settings from 40.vsftpd and 41.httpd - they're the defaults. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 15d4111d0761d82f57d5d4f0b1227812d14e4d7c) 2010-12-15 11:19:21 +03:00			`rm -f "$_ctdb_service_reconfigure_flag"`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`}`

			`ctdb_service_reconfigure ()`
			`{`
eventscripts: Simplify handling of $service name in reconfigure functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c2ea72ff565222f9edab408638bd45dbba6e8ff7) 2013-04-29 20:59:41 +04:00			`echo "Reconfiguring service \"${service_name}\"..."`
			`ctdb_service_unset_reconfigure`
			`service_reconfigure || return $?`
			`ctdb_counter_init`
Eventscripts: rejig the reconfigure infrastructure. * Add an optional service name argument to existing reconfigure functions. * User function service_reconfigure() instead of variable $service_reconfigure to specify how a service is reconfigured. * New function ctdb_service_check_reconfigure() reconfigures a service if it is flagged for reconfigure. * Remove $service_reconfigure settings from 40.vsftpd and 41.httpd - they're the defaults. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 15d4111d0761d82f57d5d4f0b1227812d14e4d7c) 2010-12-15 11:19:21 +03:00			`}`

Eventscripts: Change the default reconfigure action to do nothing A default action of restarting the service doesn't obey the principle of least surprise. It cause the NFS service to be implicitly reintroduced. This allows no-op functions to be removed from some eventscripts and service restart functions to be added to others. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c75b5e5b4d000f5c7dab403df8238ceed390c1c0) 2012-12-04 08:00:44 +04:00			`# Default service_reconfigure() function does nothing.`
Eventscripts: rejig the reconfigure infrastructure. * Add an optional service name argument to existing reconfigure functions. * User function service_reconfigure() instead of variable $service_reconfigure to specify how a service is reconfigured. * New function ctdb_service_check_reconfigure() reconfigures a service if it is flagged for reconfigure. * Remove $service_reconfigure settings from 40.vsftpd and 41.httpd - they're the defaults. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 15d4111d0761d82f57d5d4f0b1227812d14e4d7c) 2010-12-15 11:19:21 +03:00			`service_reconfigure ()`
			`{`
Eventscripts: Change the default reconfigure action to do nothing A default action of restarting the service doesn't obey the principle of least surprise. It cause the NFS service to be implicitly reintroduced. This allows no-op functions to be removed from some eventscripts and service restart functions to be added to others. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c75b5e5b4d000f5c7dab403df8238ceed390c1c0) 2012-12-04 08:00:44 +04:00			`:`
Eventscripts: rejig the reconfigure infrastructure. * Add an optional service name argument to existing reconfigure functions. * User function service_reconfigure() instead of variable $service_reconfigure to specify how a service is reconfigured. * New function ctdb_service_check_reconfigure() reconfigures a service if it is flagged for reconfigure. * Remove $service_reconfigure settings from 40.vsftpd and 41.httpd - they're the defaults. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 15d4111d0761d82f57d5d4f0b1227812d14e4d7c) 2010-12-15 11:19:21 +03:00			`}`
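
The reconfigure functions above work from a flag file plus a hook: something creates `reconfigure` under the service's status directory, and a later event clears the flag and calls `service_reconfigure`, which an eventscript may override. A hedged sketch of that flow follows. The short helper names, the temporary directory and the service name `example` are assumptions for illustration, and the counter reset done by the real `ctdb_service_reconfigure` is left out.

```shell
#!/bin/sh
# Sketch only: hypothetical stand-ins for values the framework provides.
ctdb_status_dir=$(mktemp -d)
service_name="example"
flag="$ctdb_status_dir/$service_name/reconfigure"

needs_reconfigure ()
{
    [ -e "$flag" ]
}

set_reconfigure ()
{
    mkdir -p "${flag%/*}"    # dirname
    : >"$flag"
}

unset_reconfigure ()
{
    rm -f "$flag"
}

# An eventscript replaces the no-op default with its own action.
reconfigured=no
service_reconfigure ()
{
    reconfigured=yes
}

# Clear the flag first, then run the hook and pass on any failure.
run_reconfigure ()
{
    unset_reconfigure
    service_reconfigure || return $?
}

set_reconfigure
if needs_reconfigure ; then
    run_reconfigure
fi

pending=no
if needs_reconfigure ; then
    pending=yes
fi

echo "reconfigured=$reconfigured pending=$pending"
rm -rf "$ctdb_status_dir"
```

Clearing the flag before calling the hook means a failed reconfigure is not retried automatically. The caller sees the non-zero status and decides what to do.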

ctdb/eventscripts: Reconfigure lock should be released quickly Currently the lock is held until the corresponding eventscript completes, since the process still exists. If the regular part of an eventscript hangs then the lock might unnecessarily be held for a long time. The pathological case is when a monitor event gets stuck in D-wait state and the script times out but can't be killed so the lock is still held. This can cause an unwanted monitor replay. Change this so that the lock is released immediately after the reconfiguration is complete. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2013-12-18 06:51:22 +04:00			`ctdb_reconfigure_take_lock ()`
Eventscripts: add a synchronous synthetic reconfigure event. In the current code services can only be reconfigured asynchronously. This means that configuration file changes can be made, an asychronous reconfigure event can be triggered, and it always succeeds. Some time later when a service is actually reconfigured then a failure may be seen This adds a synthetic reconfigure event that reconfigures a service synchronously so that any failure is reported on exit. ctdb_service_check_reconfigure() is essentially reimplemented. If a reconfigure event is in flight and an ipreallocated or monitor event occurs then any scheduled asynchronous reconfigure is deferred until the next monitor cycle. This is to avoid reconfigures trampling on each other. In this case a monitor event will also replay the previous status to try to avoid exposing any temporary instability. If a reconfigure event collides with another reconfigure event it will exit with status 2, indicating that the reconfigure should be retried. The reconfigure event is implemented using a subprocess to control the exit from the synthetic event. As before, if a monitor event causes a scheduled synchronous reconfigure to occure then it will replay the previous status for the service, given that a reconfigure can cause temporary instability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 220578bfd3507152b29ba4c28942f9d5e8733886) 2011-05-16 08:23:28 +04:00			`{`
eventscripts: Simplify handling of $service name in reconfigure functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c2ea72ff565222f9edab408638bd45dbba6e8ff7) 2013-04-29 20:59:41 +04:00			`_ctdb_service_reconfigure_common`
Eventscripts: add a synchronous synthetic reconfigure event. In the current code services can only be reconfigured asynchronously. This means that configuration file changes can be made, an asychronous reconfigure event can be triggered, and it always succeeds. Some time later when a service is actually reconfigured then a failure may be seen This adds a synthetic reconfigure event that reconfigures a service synchronously so that any failure is reported on exit. ctdb_service_check_reconfigure() is essentially reimplemented. If a reconfigure event is in flight and an ipreallocated or monitor event occurs then any scheduled asynchronous reconfigure is deferred until the next monitor cycle. This is to avoid reconfigures trampling on each other. In this case a monitor event will also replay the previous status to try to avoid exposing any temporary instability. If a reconfigure event collides with another reconfigure event it will exit with status 2, indicating that the reconfigure should be retried. The reconfigure event is implemented using a subprocess to control the exit from the synthetic event. As before, if a monitor event causes a scheduled synchronous reconfigure to occure then it will replay the previous status for the service, given that a reconfigure can cause temporary instability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 220578bfd3507152b29ba4c28942f9d5e8733886) 2011-05-16 08:23:28 +04:00			`_lock="${_d}/reconfigure_lock"`
eventscripts: Ensure directories are created Previous commits stopped the top level of the script from creating certain directories but some functions assume that required directories exist. Create those directories instead. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0076cfc4666e5a96eb2c8affb59585b090840e00) 2013-04-22 00:52:49 +04:00			`mkdir -p "${_lock%/*}" # dirname`
Eventscripts: add a synchronous synthetic reconfigure event. In the current code services can only be reconfigured asynchronously. This means that configuration file changes can be made, an asychronous reconfigure event can be triggered, and it always succeeds. Some time later when a service is actually reconfigured then a failure may be seen This adds a synthetic reconfigure event that reconfigures a service synchronously so that any failure is reported on exit. ctdb_service_check_reconfigure() is essentially reimplemented. If a reconfigure event is in flight and an ipreallocated or monitor event occurs then any scheduled asynchronous reconfigure is deferred until the next monitor cycle. This is to avoid reconfigures trampling on each other. In this case a monitor event will also replay the previous status to try to avoid exposing any temporary instability. If a reconfigure event collides with another reconfigure event it will exit with status 2, indicating that the reconfigure should be retried. The reconfigure event is implemented using a subprocess to control the exit from the synthetic event. As before, if a monitor event causes a scheduled synchronous reconfigure to occure then it will replay the previous status for the service, given that a reconfigure can cause temporary instability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 220578bfd3507152b29ba4c28942f9d5e8733886) 2011-05-16 08:23:28 +04:00			`touch "$_lock"`

			`(`
			`flock 0`
			`# This is overkill but will work if we need to extend this to`
			`# allow certain events to run multiple times in parallel`
			`# (e.g. takeip) and write multiple PIDs to the file.`
			`read _locker_event`
			`if [ -n "$_locker_event" ] ; then`
			`while read _pid ; do`
			`if [ -n "$_pid" -a "$_pid" != $$ ] && \`
			`kill -0 "$_pid" 2>/dev/null ; then`
			`exit 1`
			`fi`
			`done`
			`fi`

			`printf "%s\n%s\n" "$event_name" $$ >"$_lock"`
			`exit 0`
			`) <"$_lock"`
			`}`
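
The lock file written above holds the event name on its first line and one PID per line after that, and a lock only counts as held while a recorded PID is still alive. The sketch below illustrates just that liveness check, assuming the same file layout. It leaves out `flock` and the exclusion of the caller's own PID, and the helper name and temporary lock path are hypothetical.

```shell
#!/bin/sh
# Sketch only: the lock path and helper name are hypothetical.
lock=$(mktemp)

# Succeed if any PID recorded in the lock file is still running.
# Line 1 of the file is the event name, later lines are PIDs.
lock_holder_alive ()
{
    {
        read _event || return 1
        while read _pid ; do
            if [ -n "$_pid" ] && kill -0 "$_pid" 2>/dev/null ; then
                return 0
            fi
        done
        return 1
    } <"$1"
}

# A lock recorded by this (running) shell is considered held.
printf "%s\n%s\n" "monitor" $$ >"$lock"
held_by_live=no
if lock_holder_alive "$lock" ; then
    held_by_live=yes
fi

# A lock recorded by a process that has already exited is stale.
( : ) &
dead_pid=$!
wait "$dead_pid"
printf "%s\n%s\n" "monitor" "$dead_pid" >"$lock"
held_by_dead=no
if lock_holder_alive "$lock" ; then
    held_by_dead=yes
fi

echo "live=$held_by_live dead=$held_by_dead"
rm -f "$lock"
```

Inside a `( ... )` subshell, `$$` still expands to the PID of the parent shell, so the PID recorded by `ctdb_reconfigure_take_lock` is that of the eventscript itself.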

ctdb/eventscripts: Reconfigure lock should be released quickly Currently the lock is held until the corresponding eventscript completes, since the process still exists. If the regular part of an eventscript hangs then the lock might unnecessarily be held for a long time. The pathological case is when a monitor event gets stuck in D-wait state and the script times out but can't be killed so the lock is still held. This can cause an unwanted monitor replay. Change this so that the lock is released immediately after the reconfiguration is complete. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2013-12-18 06:51:22 +04:00			`ctdb_reconfigure_release_lock ()`
			`{`
			`_ctdb_service_reconfigure_common`
			`_lock="${_d}/reconfigure_lock"`

			`rm -f "$_lock"`
			`}`

Eventscripts: add a synchronous synthetic reconfigure event. In the current code services can only be reconfigured asynchronously. This means that configuration file changes can be made, an asychronous reconfigure event can be triggered, and it always succeeds. Some time later when a service is actually reconfigured then a failure may be seen This adds a synthetic reconfigure event that reconfigures a service synchronously so that any failure is reported on exit. ctdb_service_check_reconfigure() is essentially reimplemented. If a reconfigure event is in flight and an ipreallocated or monitor event occurs then any scheduled asynchronous reconfigure is deferred until the next monitor cycle. This is to avoid reconfigures trampling on each other. In this case a monitor event will also replay the previous status to try to avoid exposing any temporary instability. If a reconfigure event collides with another reconfigure event it will exit with status 2, indicating that the reconfigure should be retried. The reconfigure event is implemented using a subprocess to control the exit from the synthetic event. As before, if a monitor event causes a scheduled synchronous reconfigure to occure then it will replay the previous status for the service, given that a reconfigure can cause temporary instability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 220578bfd3507152b29ba4c28942f9d5e8733886) 2011-05-16 08:23:28 +04:00			`ctdb_replay_monitor_status ()`
			`{`
			`echo "Replaying previous status for this script due to reconfigure..."`
ctdb-scripts: Update eventscripts to use ctdb -X instead of ctdb -Y Also update associated eventscript unit tests and ctdb stub. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-11-20 06:32:46 +03:00			`# Leading separator ('|') is missing in some versions...`
			`_out=$(ctdb scriptstatus -X | grep -E "^\|?monitor\|${script_name}\|")`
Eventscripts - enhance ctdb_replay_monitor_status() Print useful output and return a suitable exit code. The DISABLED and TIMEDOUT statuses use fake negative return codes, and these can't be faked from the shell. So we map DISABLED to OK and TIMEDOUT to ERROR - this should avoid nearly all surprises. When we do this we add a note to the beginning of the output. The alternative is to "fix" ctdbd to use only codes that can actually be returned by shell scripts. However, the reason for using negative codes is probably to distinguish them from real ones... Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit dda44d026e0c1b02feb02185b8c200a542be341a) 2011-08-31 09:34:43 +04:00			`# Output looks like this:`
ctdb-scripts: Update eventscripts to use ctdb -X instead of ctdb -Y Also update associated eventscript unit tests and ctdb stub. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-11-20 06:32:46 +03:00			`# |monitor|60.nfs|1|ERROR|1314764004.030861|1314764004.035514|foo bar|`
Eventscripts - enhance ctdb_replay_monitor_status() Print useful output and return a suitable exit code. The DISABLED and TIMEDOUT statuses use fake negative return codes, and these can't be faked from the shell. So we map DISABLED to OK and TIMEDOUT to ERROR - this should avoid nearly all surprises. When we do this we add a note to the beginning of the output. The alternative is to "fix" ctdbd to use only codes that can actually be returned by shell scripts. However, the reason for using negative codes is probably to distinguish them from real ones... Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit dda44d026e0c1b02feb02185b8c200a542be341a) 2011-08-31 09:34:43 +04:00			`# This is the cheapest way of getting fields in the middle.`
ctdb-scripts: Update eventscripts to use ctdb -X instead of ctdb -Y Also update associated eventscript unit tests and ctdb stub. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-11-20 06:32:46 +03:00			`set -- $(IFS="|" ; echo $_out)`
Eventscripts - enhance ctdb_replay_monitor_status() Print useful output and return a suitable exit code. The DISABLED and TIMEDOUT statuses use fake negative return codes, and these can't be faked from the shell. So we map DISABLED to OK and TIMEDOUT to ERROR - this should avoid nearly all surprises. When we do this we add a note to the beginning of the output. The alternative is to "fix" ctdbd to use only codes that can actually be returned by shell scripts. However, the reason for using negative codes is probably to distinguish them from real ones... Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit dda44d026e0c1b02feb02185b8c200a542be341a) 2011-08-31 09:34:43 +04:00			`_code="$3"`
			`_status="$4"`
			`# The error output field can include colons so we'll try to`
			`# preserve them. The weak checking at the beginning tries to make`
ctdb-scripts: Update eventscripts to use ctdb -X instead of ctdb -Y Also update associated eventscript unit tests and ctdb stub. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-11-20 06:32:46 +03:00			`# this work for both broken (no leading '|') and fixed output.`
			`_out="${_out%|}"`
			`_err_out="${_out#*monitor|${script_name}|*|*|*|*|}"`
Eventscripts - enhance ctdb_replay_monitor_status() Print useful output and return a suitable exit code. The DISABLED and TIMEDOUT statuses use fake negative return codes, and these can't be faked from the shell. So we map DISABLED to OK and TIMEDOUT to ERROR - this should avoid nearly all surprises. When we do this we add a note to the beginning of the output. The alternative is to "fix" ctdbd to use only codes that can actually be returned by shell scripts. However, the reason for using negative codes is probably to distinguish them from real ones... Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit dda44d026e0c1b02feb02185b8c200a542be341a) 2011-08-31 09:34:43 +04:00			`case "$_status" in`
			`OK) : ;; # Do nothing special.`
			`TIMEDOUT)`
			`# Recast this as an error, since we can't exit with the`
			`# correct negative number.`
			`_code=1`
			`_err_out="[Replay of TIMEDOUT scriptstatus - note incorrect return code.] ${_err_out}"`
			`;;`
			`DISABLED)`
			`# Recast this as an OK, since we can't exit with the`
			`# correct negative number.`
			`_code=0`
			`_err_out="[Replay of DISABLED scriptstatus - note incorrect return code.] ${_err_out}"`
			`;;`
			`*) : ;; # Must be ERROR, do nothing special.`
			`esac`
eventscripts: When replaying monitor status, don't log empty output Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit ce04f1c107b4392ca955d9f29b93aaaae62439ce) 2013-06-24 13:03:26 +04:00			`if [ -n "$_err_out" ] ; then`
			`echo "$_err_out"`
			`fi`
Eventscripts - enhance ctdb_replay_monitor_status() Print useful output and return a suitable exit code. The DISABLED and TIMEDOUT statuses use fake negative return codes, and these can't be faked from the shell. So we map DISABLED to OK and TIMEDOUT to ERROR - this should avoid nearly all surprises. When we do this we add a note to the beginning of the output. The alternative is to "fix" ctdbd to use only codes that can actually be returned by shell scripts. However, the reason for using negative codes is probably to distinguish them from real ones... Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit dda44d026e0c1b02feb02185b8c200a542be341a) 2011-08-31 09:34:43 +04:00			`exit $_code`
Eventscripts: add a synchronous synthetic reconfigure event. In the current code services can only be reconfigured asynchronously. This means that configuration file changes can be made, an asychronous reconfigure event can be triggered, and it always succeeds. Some time later when a service is actually reconfigured then a failure may be seen This adds a synthetic reconfigure event that reconfigures a service synchronously so that any failure is reported on exit. ctdb_service_check_reconfigure() is essentially reimplemented. If a reconfigure event is in flight and an ipreallocated or monitor event occurs then any scheduled asynchronous reconfigure is deferred until the next monitor cycle. This is to avoid reconfigures trampling on each other. In this case a monitor event will also replay the previous status to try to avoid exposing any temporary instability. If a reconfigure event collides with another reconfigure event it will exit with status 2, indicating that the reconfigure should be retried. The reconfigure event is implemented using a subprocess to control the exit from the synthetic event. As before, if a monitor event causes a scheduled synchronous reconfigure to occure then it will replay the previous status for the service, given that a reconfigure can cause temporary instability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 220578bfd3507152b29ba4c28942f9d5e8733886) 2011-05-16 08:23:28 +04:00			`}`
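
The field extraction in `ctdb_replay_monitor_status` can be tried without a running ctdbd by feeding it the sample line from the comment. This is a sketch under that assumption. The input is hard-coded rather than read from `ctdb scriptstatus -X`, and the prefix pattern is assumed to have the form `*monitor|<script>|*|*|*|*|`, which skips the code, status and two timestamp fields.

```shell
#!/bin/sh
# Sketch only: sample input copied from the comment in the function;
# a real eventscript gets this line from "ctdb scriptstatus -X".
script_name="60.nfs"
_out='|monitor|60.nfs|1|ERROR|1314764004.030861|1314764004.035514|foo bar|'

# Split on '|' in a subshell so the caller's IFS is left untouched.
# The empty field before the leading '|' disappears when the outer
# shell re-splits the echoed text on whitespace.
set -- $(IFS="|" ; echo $_out)
_code="$3"
_status="$4"

# Drop the trailing separator, then strip everything up to and
# including the second timestamp, leaving only the output field.
_out="${_out%|}"
_err_out="${_out#*monitor|${script_name}|*|*|*|*|}"

echo "code=$_code status=$_status output=$_err_out"
```

The leading `*` in the prefix pattern lets the same expansion work whether or not the line starts with a separator. Using parameter expansion for the last field, rather than `$7`, keeps spaces in the output text intact.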

Eventscripts: rejig the reconfigure infrastructure. * Add an optional service name argument to existing reconfigure functions. * User function service_reconfigure() instead of variable $service_reconfigure to specify how a service is reconfigured. * New function ctdb_service_check_reconfigure() reconfigures a service if it is flagged for reconfigure. * Remove $service_reconfigure settings from 40.vsftpd and 41.httpd - they're the defaults. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 15d4111d0761d82f57d5d4f0b1227812d14e4d7c) 2010-12-15 11:19:21 +03:00			`ctdb_service_check_reconfigure ()`
			`{`
eventscripts: Assert that $service_name is set in a few key places Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3d0a7d83ddc824961d876fc9afba829c90aef3e7) 2013-04-29 21:45:21 +04:00			`assert_service_name`

Eventscripts: add a synchronous synthetic reconfigure event. In the current code services can only be reconfigured asynchronously. This means that configuration file changes can be made, an asychronous reconfigure event can be triggered, and it always succeeds. Some time later when a service is actually reconfigured then a failure may be seen This adds a synthetic reconfigure event that reconfigures a service synchronously so that any failure is reported on exit. ctdb_service_check_reconfigure() is essentially reimplemented. If a reconfigure event is in flight and an ipreallocated or monitor event occurs then any scheduled asynchronous reconfigure is deferred until the next monitor cycle. This is to avoid reconfigures trampling on each other. In this case a monitor event will also replay the previous status to try to avoid exposing any temporary instability. If a reconfigure event collides with another reconfigure event it will exit with status 2, indicating that the reconfigure should be retried. The reconfigure event is implemented using a subprocess to control the exit from the synthetic event. As before, if a monitor event causes a scheduled synchronous reconfigure to occure then it will replay the previous status for the service, given that a reconfigure can cause temporary instability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 220578bfd3507152b29ba4c28942f9d5e8733886) 2011-05-16 08:23:28 +04:00			`# We only care about some events in this function. For others we`
			`# return now.`
Evenscripts: improvements to ctdb_service_check_reconfigure(). * Make this function applicable to "ipreallocated" event too. * Monitor event should not always succeed just because we reconfigure. If the service was unhealthy before the reconfigure and we end the reconfigure with "exit 0" then we can cause the node's health status to flip-flop. To avoid this we return the status of the service from the previous monitor event. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 21dfcbbdccd906fcd6ab7bba81418ce565bf63aa) 2011-01-14 01:31:56 +03:00			`case "$event_name" in`
Eventscripts: add a synchronous synthetic reconfigure event. In the current code services can only be reconfigured asynchronously. This means that configuration file changes can be made, an asychronous reconfigure event can be triggered, and it always succeeds. Some time later when a service is actually reconfigured then a failure may be seen This adds a synthetic reconfigure event that reconfigures a service synchronously so that any failure is reported on exit. ctdb_service_check_reconfigure() is essentially reimplemented. If a reconfigure event is in flight and an ipreallocated or monitor event occurs then any scheduled asynchronous reconfigure is deferred until the next monitor cycle. This is to avoid reconfigures trampling on each other. In this case a monitor event will also replay the previous status to try to avoid exposing any temporary instability. If a reconfigure event collides with another reconfigure event it will exit with status 2, indicating that the reconfigure should be retried. The reconfigure event is implemented using a subprocess to control the exit from the synthetic event. As before, if a monitor event causes a scheduled synchronous reconfigure to occure then it will replay the previous status for the service, given that a reconfigure can cause temporary instability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 220578bfd3507152b29ba4c28942f9d5e8733886) 2011-05-16 08:23:28 +04:00			`monitor|ipreallocated|reconfigure) : ;;`
			`*) return 0 ;;`
Evenscripts: improvements to ctdb_service_check_reconfigure(). * Make this function applicable to "ipreallocated" event too. * Monitor event should not always succeed just because we reconfigure. If the service was unhealthy before the reconfigure and we end the reconfigure with "exit 0" then we can cause the node's health status to flip-flop. To avoid this we return the status of the service from the previous monitor event. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 21dfcbbdccd906fcd6ab7bba81418ce565bf63aa) 2011-01-14 01:31:56 +03:00			`esac`
Eventscript functions: ctdb_service_check-reconfigure() acts only on monitor. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit beabf506a5eb68fc50fdbf8772c1d2bb0f7951e3) 2010-12-16 01:50:44 +03:00
ctdb/eventscripts: Reconfigure lock should be released quickly Currently the lock is held until the corresponding eventscript completes, since the process still exists. If the regular part of an eventscript hangs then the lock might unnecessarily be held for a long time. The pathological case is when a monitor event gets stuck in D-wait state and the script times out but can't be killed so the lock is still held. This can cause an unwanted monitor replay. Change this so that the lock is released immediately after the reconfiguration is complete. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2013-12-18 06:51:22 +04:00			`if ctdb_reconfigure_take_lock ; then`
Eventscripts: add a synchronous synthetic reconfigure event. In the current code services can only be reconfigured asynchronously. This means that configuration file changes can be made, an asychronous reconfigure event can be triggered, and it always succeeds. Some time later when a service is actually reconfigured then a failure may be seen This adds a synthetic reconfigure event that reconfigures a service synchronously so that any failure is reported on exit. ctdb_service_check_reconfigure() is essentially reimplemented. If a reconfigure event is in flight and an ipreallocated or monitor event occurs then any scheduled asynchronous reconfigure is deferred until the next monitor cycle. This is to avoid reconfigures trampling on each other. In this case a monitor event will also replay the previous status to try to avoid exposing any temporary instability. If a reconfigure event collides with another reconfigure event it will exit with status 2, indicating that the reconfigure should be retried. The reconfigure event is implemented using a subprocess to control the exit from the synthetic event. As before, if a monitor event causes a scheduled synchronous reconfigure to occure then it will replay the previous status for the service, given that a reconfigure can cause temporary instability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 220578bfd3507152b29ba4c28942f9d5e8733886) 2011-05-16 08:23:28 +04:00			`# No events covered by this function are running, so proceed`
			`# with gay abandon.`
			`case "$event_name" in`
			`reconfigure)`
eventscripts: Simplify handling of $service name in reconfigure functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c2ea72ff565222f9edab408638bd45dbba6e8ff7) 2013-04-29 20:59:41 +04:00			`(ctdb_service_reconfigure)`
Eventscripts: add a synchronous synthetic reconfigure event. In the current code services can only be reconfigured asynchronously. This means that configuration file changes can be made, an asychronous reconfigure event can be triggered, and it always succeeds. Some time later when a service is actually reconfigured then a failure may be seen This adds a synthetic reconfigure event that reconfigures a service synchronously so that any failure is reported on exit. ctdb_service_check_reconfigure() is essentially reimplemented. If a reconfigure event is in flight and an ipreallocated or monitor event occurs then any scheduled asynchronous reconfigure is deferred until the next monitor cycle. This is to avoid reconfigures trampling on each other. In this case a monitor event will also replay the previous status to try to avoid exposing any temporary instability. If a reconfigure event collides with another reconfigure event it will exit with status 2, indicating that the reconfigure should be retried. The reconfigure event is implemented using a subprocess to control the exit from the synthetic event. As before, if a monitor event causes a scheduled synchronous reconfigure to occure then it will replay the previous status for the service, given that a reconfigure can cause temporary instability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 220578bfd3507152b29ba4c28942f9d5e8733886) 2011-05-16 08:23:28 +04:00			`exit $?`
			`;;`
			`ipreallocated)`
eventscripts: Simplify handling of $service name in reconfigure functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c2ea72ff565222f9edab408638bd45dbba6e8ff7) 2013-04-29 20:59:41 +04:00			`if ctdb_service_needs_reconfigure ; then`
			`ctdb_service_reconfigure`
Eventscripts: add a synchronous synthetic reconfigure event. In the current code services can only be reconfigured asynchronously. This means that configuration file changes can be made, an asychronous reconfigure event can be triggered, and it always succeeds. Some time later when a service is actually reconfigured then a failure may be seen This adds a synthetic reconfigure event that reconfigures a service synchronously so that any failure is reported on exit. ctdb_service_check_reconfigure() is essentially reimplemented. If a reconfigure event is in flight and an ipreallocated or monitor event occurs then any scheduled asynchronous reconfigure is deferred until the next monitor cycle. This is to avoid reconfigures trampling on each other. In this case a monitor event will also replay the previous status to try to avoid exposing any temporary instability. If a reconfigure event collides with another reconfigure event it will exit with status 2, indicating that the reconfigure should be retried. The reconfigure event is implemented using a subprocess to control the exit from the synthetic event. As before, if a monitor event causes a scheduled synchronous reconfigure to occure then it will replay the previous status for the service, given that a reconfigure can cause temporary instability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 220578bfd3507152b29ba4c28942f9d5e8733886) 2011-05-16 08:23:28 +04:00			`fi`
			`;;`
			`esac`
ctdb/eventscripts: Reconfigure lock should be released quickly Currently the lock is held until the corresponding eventscript completes, since the process still exists. If the regular part of an eventscript hangs then the lock might unnecessarily be held for a long time. The pathological case is when a monitor event gets stuck in D-wait state and the script times out but can't be killed so the lock is still held. This can cause an unwanted monitor replay. Change this so that the lock is released immediately after the reconfiguration is complete. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2013-12-18 06:51:22 +04:00
			`ctdb_reconfigure_release_lock`
Eventscripts: add a synchronous synthetic reconfigure event. In the current code services can only be reconfigured asynchronously. This means that configuration file changes can be made, an asychronous reconfigure event can be triggered, and it always succeeds. Some time later when a service is actually reconfigured then a failure may be seen This adds a synthetic reconfigure event that reconfigures a service synchronously so that any failure is reported on exit. ctdb_service_check_reconfigure() is essentially reimplemented. If a reconfigure event is in flight and an ipreallocated or monitor event occurs then any scheduled asynchronous reconfigure is deferred until the next monitor cycle. This is to avoid reconfigures trampling on each other. In this case a monitor event will also replay the previous status to try to avoid exposing any temporary instability. If a reconfigure event collides with another reconfigure event it will exit with status 2, indicating that the reconfigure should be retried. The reconfigure event is implemented using a subprocess to control the exit from the synthetic event. As before, if a monitor event causes a scheduled synchronous reconfigure to occure then it will replay the previous status for the service, given that a reconfigure can cause temporary instability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 220578bfd3507152b29ba4c28942f9d5e8733886) 2011-05-16 08:23:28 +04:00			`else`
			`# Somebody else is running an event we don't want to collide`
			`# with. We proceed with caution.`
			`case "$event_name" in`
			`reconfigure)`
			`# Tell whoever called us to retry.`
			`exit 2`
			`;;`
			`ipreallocated)`
			`# Defer any scheduled reconfigure and just run the`
			`# rest of the ipreallocated event, as per the`
			`# eventscript. There's an assumption here that the`
			`# event doesn't depend on any scheduled reconfigure.`
			`# This is true in the current code.`
			`return 0`
			`;;`
			`monitor)`
			`# There is most likely a reconfigure in progress so`
			`# the service is possibly unstable. As above, we`
			`# defer any scheduled reconfigure. We also replay`
			`# the previous monitor status since that's the best`
			`# information we have.`
			`ctdb_replay_monitor_status`
			`;;`
			`esac`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`fi`
			`}`
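
The collision branch above makes a reconfigure event exit with status 2 so that the caller can retry. A minimal sketch of what such a caller might look like follows. The names retry_on_collision and fake_reconfigure are hypothetical and do not appear in the functions file. The sketch uses shell functions and return codes where the real event would be a separate process calling exit, and a real caller would probably pause between attempts.

```shell
#!/bin/sh
# Hypothetical sketch: retry a command while it reports status 2.

# retry_on_collision MAX COMMAND [ARGS...]
# Runs COMMAND up to MAX times. Any status other than 2 is returned
# immediately. Status 2 is treated as "collided, try again".
retry_on_collision ()
{
    _max="$1" ; shift
    _n=0
    while [ "$_n" -lt "$_max" ] ; do
        "$@"
        _rc=$?
        [ "$_rc" -eq 2 ] || return "$_rc"
        _n=$((_n + 1))
    done
    # Still colliding after MAX attempts.
    return 2
}

# Stand-in for a reconfigure event: collides twice, then succeeds.
_calls=0
fake_reconfigure ()
{
    _calls=$((_calls + 1))
    [ "$_calls" -ge 3 ] && return 0
    return 2
}

retry_on_collision 5 fake_reconfigure
echo "status $? after $_calls attempts"
```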

Eventscripts: rejig the reconfigure infrastructure. * Add an optional service name argument to existing reconfigure functions. * User function service_reconfigure() instead of variable $service_reconfigure to specify how a service is reconfigured. * New function ctdb_service_check_reconfigure() reconfigures a service if it is flagged for reconfigure. * Remove $service_reconfigure settings from 40.vsftpd and 41.httpd - they're the defaults. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 15d4111d0761d82f57d5d4f0b1227812d14e4d7c) 2010-12-15 11:19:21 +03:00			`##################################################################`
			`# Does CTDB manage this service? - and associated auto-start/stop`

Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`ctdb_compat_managed_service ()`
			`{`
eventscripts: Simplify handling of $service name in service_management Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e24baac0d2952e86d5ff31235901f06e2f2b2449) 2013-04-29 21:13:36 +04:00			`if [ "$1" = "yes" -a "$2" = "$service_name" ] ; then`
Eventscript functions - optimise is_ctdb_managed_service(). This function generates a lot of trace when running under "set -x". This is due to the backward compatibility code. This adds 3 optimisations: 1. Before invoking the backward compatiblity code, is_ctdb_managed_service() returns early if the service is listed in $CTDB_MANAGED_SERVICES. 2. ctdb_compat_managed_service() actually now updates $CTDB_MANAGED_SERVICES instead of temporary variable $t. This means that a subsequent call to is_ctdb_managed_service() will short circuit due to optimisation (1). 3. ctdb_compat_managed_service() only adds a service to $CTDB_MANAGED_SERVICES if it is the service being checked by is_ctdb_managed_service(). This stops irrelevant services being added to $CTDB_MANAGED_SERVICES multiple times by multiple calls to is_ctdb_managed_service(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 758f4667c60089e09a0439c1eb74f5e426ca5e2e) 2010-11-18 08:19:45 +03:00			`CTDB_MANAGED_SERVICES="$CTDB_MANAGED_SERVICES $2"`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`fi`
			`}`

			`is_ctdb_managed_service ()`
			`{`
eventscripts: Assert that $service_name is set in a few key places Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3d0a7d83ddc824961d876fc9afba829c90aef3e7) 2013-04-29 21:45:21 +04:00			`assert_service_name`

Eventscript functions - optimise is_ctdb_managed_service(). This function generates a lot of trace when running under "set -x". This is due to the backward compatibility code. This adds 3 optimisations: 1. Before invoking the backward compatiblity code, is_ctdb_managed_service() returns early if the service is listed in $CTDB_MANAGED_SERVICES. 2. ctdb_compat_managed_service() actually now updates $CTDB_MANAGED_SERVICES instead of temporary variable $t. This means that a subsequent call to is_ctdb_managed_service() will short circuit due to optimisation (1). 3. ctdb_compat_managed_service() only adds a service to $CTDB_MANAGED_SERVICES if it is the service being checked by is_ctdb_managed_service(). This stops irrelevant services being added to $CTDB_MANAGED_SERVICES multiple times by multiple calls to is_ctdb_managed_service(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 758f4667c60089e09a0439c1eb74f5e426ca5e2e) 2010-11-18 08:19:45 +03:00			`# $t is used just for readability and to allow more accurate`
			`# matching via leading/trailing spaces`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`t=" $CTDB_MANAGED_SERVICES "`

eventscripts: Simplify handling of $service name in service_management Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e24baac0d2952e86d5ff31235901f06e2f2b2449) 2013-04-29 21:13:36 +04:00			`# Return 0 if "<space>$service_name<space>" appears in $t`
			`if [ "${t#* ${service_name} }" != "${t}" ] ; then`
Eventscript functions - optimise is_ctdb_managed_service(). This function generates a lot of trace when running under "set -x". This is due to the backward compatibility code. This adds 3 optimisations: 1. Before invoking the backward compatiblity code, is_ctdb_managed_service() returns early if the service is listed in $CTDB_MANAGED_SERVICES. 2. ctdb_compat_managed_service() actually now updates $CTDB_MANAGED_SERVICES instead of temporary variable $t. This means that a subsequent call to is_ctdb_managed_service() will short circuit due to optimisation (1). 3. ctdb_compat_managed_service() only adds a service to $CTDB_MANAGED_SERVICES if it is the service being checked by is_ctdb_managed_service(). This stops irrelevant services being added to $CTDB_MANAGED_SERVICES multiple times by multiple calls to is_ctdb_managed_service(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 758f4667c60089e09a0439c1eb74f5e426ca5e2e) 2010-11-18 08:19:45 +03:00			`return 0`
			`fi`

			`# If above didn't match then update $CTDB_MANAGED_SERVICES for`
			`# backward compatibility and try again.`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`ctdb_compat_managed_service "$CTDB_MANAGES_VSFTPD" "vsftpd"`
			`ctdb_compat_managed_service "$CTDB_MANAGES_SAMBA" "samba"`
update autostart/stop to work for samba (This used to be ctdb commit 37ab57e2adaecc3f7996ea20af45a5df0cd8be76) 2010-11-18 07:40:19 +03:00			`ctdb_compat_managed_service "$CTDB_MANAGES_WINBIND" "winbind"`
apache's service name is not always httpd Solution 2 of <https://bugzilla.samba.org/show_bug.cgi?id=8317> (This used to be ctdb commit 8b9ac5cd8d867ff4866ac464c570d9293d03a91e) 2011-07-26 01:35:49 +04:00			`ctdb_compat_managed_service "$CTDB_MANAGES_HTTPD" "apache2"`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`ctdb_compat_managed_service "$CTDB_MANAGES_HTTPD" "httpd"`
			`ctdb_compat_managed_service "$CTDB_MANAGES_ISCSI" "iscsi"`
			`ctdb_compat_managed_service "$CTDB_MANAGES_CLAMD" "clamd"`
Event scripts: Respect CTDB_MANAGES_NFS and add function log_status_cat. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5d97c07be13a8209a81dfc8f73e49371949e4dc3) 2009-11-25 08:34:49 +03:00			`ctdb_compat_managed_service "$CTDB_MANAGES_NFS" "nfs"`
add a missing part of the import of the previous ganesha patch (This used to be ctdb commit 171b8855bb2feae7f7dd6a079571f3113dedd6f4) 2010-12-06 03:26:43 +03:00			`ctdb_compat_managed_service "$CTDB_MANAGES_NFS" "nfs-ganesha-gpfs"`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00
Eventscript functions - optimise is_ctdb_managed_service(). This function generates a lot of trace when running under "set -x". This is due to the backward compatibility code. This adds 3 optimisations: 1. Before invoking the backward compatiblity code, is_ctdb_managed_service() returns early if the service is listed in $CTDB_MANAGED_SERVICES. 2. ctdb_compat_managed_service() actually now updates $CTDB_MANAGED_SERVICES instead of temporary variable $t. This means that a subsequent call to is_ctdb_managed_service() will short circuit due to optimisation (1). 3. ctdb_compat_managed_service() only adds a service to $CTDB_MANAGED_SERVICES if it is the service being checked by is_ctdb_managed_service(). This stops irrelevant services being added to $CTDB_MANAGED_SERVICES multiple times by multiple calls to is_ctdb_managed_service(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 758f4667c60089e09a0439c1eb74f5e426ca5e2e) 2010-11-18 08:19:45 +03:00			`t=" $CTDB_MANAGED_SERVICES "`

eventscripts: Simplify handling of $service name in service_management Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e24baac0d2952e86d5ff31235901f06e2f2b2449) 2013-04-29 21:13:36 +04:00			`# Return 0 if "<space>$service_name<space>" appears in $t`
			`[ "${t#* ${service_name} }" != "${t}" ]`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`}`
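
The whole-word test used twice in is_ctdb_managed_service() can be tried in isolation. The sketch below is only an illustration: list_contains_word is a hypothetical helper, not part of the functions file, and it assumes the word contains no shell pattern characters such as * or ?.

```shell
#!/bin/sh
# Hypothetical helper illustrating the space-padded matching idiom.

# list_contains_word LIST WORD
# Succeeds if WORD appears as a whole, space-separated entry of LIST.
list_contains_word ()
{
    # Pad with spaces so the first and last entries match like any other.
    _t=" $1 "
    # Stripping the shortest prefix that ends in " WORD " changes $_t
    # only when " WORD " occurs somewhere in it.
    [ "${_t#* ${2} }" != "${_t}" ]
}

list_contains_word "samba winbind nfs" "nfs" && echo "nfs: listed"
# "nfs" is only a substring of "nfs-ganesha-gpfs", so this does not match.
list_contains_word "samba nfs-ganesha-gpfs" "nfs" || echo "nfs: not listed"
```

The padding is what prevents a partial match: without it, a plain substring test would treat "nfs" as present whenever "nfs-ganesha-gpfs" is listed.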

			`ctdb_start_stop_service ()`
			`{`
eventscripts: Assert that $service_name is set in a few key places Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3d0a7d83ddc824961d876fc9afba829c90aef3e7) 2013-04-29 21:45:21 +04:00			`assert_service_name`

Eventscripts: Add service-start and service-stop pseudo-events Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit be4ad110ede9981b181ac28f31ffd855a879d5df) 2012-08-21 09:52:03 +04:00			`# Allow service-start/service-stop pseudo-events to start/stop`
			`# services when we're not auto-starting/stopping and we're not`
			`# monitoring.`
			`case "$event_name" in`
			`service-start)`
eventscripts: Simplify handling of $service name in service_management Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e24baac0d2952e86d5ff31235901f06e2f2b2449) 2013-04-29 21:13:36 +04:00			`if is_ctdb_managed_service ; then`
Eventscripts: Add service-start and service-stop pseudo-events Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit be4ad110ede9981b181ac28f31ffd855a879d5df) 2012-08-21 09:52:03 +04:00			`die 'service-start event not permitted when service is managed'`
			`fi`
			`if [ "$CTDB_SERVICE_AUTOSTARTSTOP" = "yes" ] ; then`
			`die 'service-start event not permitted with $CTDB_SERVICE_AUTOSTARTSTOP = yes'`
			`fi`
eventscripts: Simplify handling of $service name in service_management Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e24baac0d2952e86d5ff31235901f06e2f2b2449) 2013-04-29 21:13:36 +04:00			`ctdb_service_start`
Eventscripts: Add service-start and service-stop pseudo-events Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit be4ad110ede9981b181ac28f31ffd855a879d5df) 2012-08-21 09:52:03 +04:00			`exit $?`
			`;;`
			`service-stop)`
eventscripts: Simplify handling of $service name in service_management Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e24baac0d2952e86d5ff31235901f06e2f2b2449) 2013-04-29 21:13:36 +04:00			`if is_ctdb_managed_service ; then`
Eventscripts: Add service-start and service-stop pseudo-events Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit be4ad110ede9981b181ac28f31ffd855a879d5df) 2012-08-21 09:52:03 +04:00			`die 'service-stop event not permitted when service is managed'`
			`fi`
			`if [ "$CTDB_SERVICE_AUTOSTARTSTOP" = "yes" ] ; then`
			`die 'service-stop event not permitted with $CTDB_SERVICE_AUTOSTARTSTOP = yes'`
			`fi`
eventscripts: Simplify handling of $service name in service_management Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e24baac0d2952e86d5ff31235901f06e2f2b2449) 2013-04-29 21:13:36 +04:00			`ctdb_service_stop`
Eventscripts: Add service-start and service-stop pseudo-events Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit be4ad110ede9981b181ac28f31ffd855a879d5df) 2012-08-21 09:52:03 +04:00			`exit $?`
			`;;`
			`esac`

Eventscripts: New configuration variable CTDB_SERVICE_AUTOSTARTSTOP. Some of the current auto-start/stop logic is broken, particularly for Samba. Fixing it is non-trivial. If $CTDB_SERVICE_AUTOSTARTSTOP is "yes" then auto-start/stop services when told to newly manage or no longer manage them. This defaults to "yes". However, if using a canned configuration file that doesn't set $CTDB_SERVICE_AUTOSTARTSTOP then this stops the auto-start-stop logic from working. Therefore, this works around CQ S1026685 - on the system in question another daemon controls service auto-start/stop and CTDB just gets in the way. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit ef71b8290ae49117d7bcc7166598b77cb64cc8a0) 2011-08-08 07:13:59 +04:00			`# Do nothing unless configured to...`
			`[ "$CTDB_SERVICE_AUTOSTARTSTOP" = "yes" ] || return 0`

Eventscripts: only autostart during a monitor event. Otherwise we might short-circuit events that are run only once and actually need to do something. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c4f9e8a43540bc049b2771e0a2d76d37b9d17331) 2011-01-11 09:12:03 +03:00			`[ "$event_name" = "monitor" ] || return 0`

eventscripts: Simplify handling of $service name in service_management Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e24baac0d2952e86d5ff31235901f06e2f2b2449) 2013-04-29 21:13:36 +04:00			`if is_ctdb_managed_service ; then`
			`if ! is_ctdb_previously_managed_service ; then`
			`echo "Starting service \"$service_name\" - now managed"`
			`background_with_logging ctdb_service_start`
Eventscript functions: move flagging of managed services. Move flagging of managed or unmanaged services into ctdb_service_start() and ctdb_service_stop(). That way services will be correctly flagged if they are started from the startup and shutdown events. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 8675744cbd90b5a5095ed6fff7b36ae82004a457) 2010-12-15 08:34:00 +03:00			`exit $?`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`fi`
update autostart/stop to work for samba (This used to be ctdb commit 37ab57e2adaecc3f7996ea20af45a5df0cd8be76) 2010-11-18 07:40:19 +03:00			`else`
eventscripts: Simplify handling of $service name in service_management Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e24baac0d2952e86d5ff31235901f06e2f2b2449) 2013-04-29 21:13:36 +04:00			`if is_ctdb_previously_managed_service ; then`
			`echo "Stopping service \"$service_name\" - no longer managed"`
			`background_with_logging ctdb_service_stop`
Eventscript functions: move flagging of managed services. Move flagging of managed or unmanaged services into ctdb_service_start() and ctdb_service_stop(). That way services will be correctly flagged if they are started from the startup and shutdown events. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 8675744cbd90b5a5095ed6fff7b36ae82004a457) 2010-12-15 08:34:00 +03:00			`exit $?`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`fi`
			`fi`
			`}`

			`ctdb_service_start ()`
			`{`
Eventscript functions: move flagging of managed services. Move flagging of managed or unmanaged services into ctdb_service_start() and ctdb_service_stop(). That way services will be correctly flagged if they are started from the startup and shutdown events. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 8675744cbd90b5a5095ed6fff7b36ae82004a457) 2010-12-15 08:34:00 +03:00			`# The service is marked managed if we've ever tried to start it.`
eventscripts: Simplify handling of $service name in start/stop functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b5802c4735e1c719a5cf9ce69489d5947bd5e8c5) 2013-04-29 21:18:01 +04:00			`ctdb_service_managed`
Eventscript functions: move flagging of managed services. Move flagging of managed or unmanaged services into ctdb_service_start() and ctdb_service_stop(). That way services will be correctly flagged if they are started from the startup and shutdown events. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 8675744cbd90b5a5095ed6fff7b36ae82004a457) 2010-12-15 08:34:00 +03:00
eventscripts: Simplify handling of $service name in start/stop functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b5802c4735e1c719a5cf9ce69489d5947bd5e8c5) 2013-04-29 21:18:01 +04:00			`service_start || return $?`
Eventscript function: change service_start into a function. service_start is currently a variable. This makes passing arguments hard. We change it to be a function and put default definitions into the functions file. We use a convention that if a service name argument is passed to a redefined version of service_start() or service_stop() then it will act unconditionally. If no argument is passed then it can use internal logic to decide if services should really be started. This is useful when a single eventscript handles multiple services. This is a cherry-pick of ae38895 that needed to be reset mid-stream. There is still some breakage following this commit. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 86e4aefed9fd1028660c98e3ea758c2b75ffc1d8) 2011-08-11 03:39:25 +04:00
eventscripts: Simplify handling of $service name in start/stop functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b5802c4735e1c719a5cf9ce69489d5947bd5e8c5) 2013-04-29 21:18:01 +04:00			`ctdb_counter_init`
Eventscripts - weaken TCP port check message if CTDB has just been started. Sometimes smbd and other services can take a while to start, especially when there is a lot of activity after ctdbd has just started. The TCP port check can then pollute the logs with lots of "ERROR" messages and possibly extra debug. This creates a flag file when a service is started (but not restarted) and this flag is removed the first time that TCP port checks succeed for that service. When a port check fails and the flag file still exists, a less extreme "INFO" message is printed rather than the usual "ERROR" message. This means that until the node actually becomes healthy we see more friendly messages. The subtext is that we're hearing false positive reports "recreates" of CQ S1024874 (samba stopped responding on port 445) quite often when ctdbd is started. This reduces the chances of people reporting such false recreates... Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 571865eb6ef847857129d0b1e2ba5fa7254bfe8c) 2011-08-05 10:39:57 +04:00			`ctdb_check_tcp_init`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`}`

			`ctdb_service_stop ()`
			`{`
eventscripts: Simplify handling of $service name in start/stop functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b5802c4735e1c719a5cf9ce69489d5947bd5e8c5) 2013-04-29 21:18:01 +04:00			`ctdb_service_unmanaged`
			`service_stop`
Eventscript function: change service_start into a function. service_start is currently a variable. This makes passing arguments hard. We change it to be a function and put default definitions into the functions file. We use a convention that if a service name argument is passed to a redefined version of service_start() or service_stop() then it will act unconditionally. If no argument is passed then it can use internal logic to decide if services should really be started. This is useful when a single eventscript handles multiple services. This is a cherry-pick of ae38895 that needed to be reset mid-stream. There is still some breakage following this commit. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 86e4aefed9fd1028660c98e3ea758c2b75ffc1d8) 2011-08-11 03:39:25 +04:00			`}`

			`# Default service_start() and service_stop() functions.`

ctdb-scripts: Update a misleading comment This comment was true when 50.samba was spaghetti because it tried to automatically manage both smbd (and nmbd) and winbind. It isn't true anymore. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Wed Feb 19 04:07:12 CET 2014 on sn-devel-104 2014-02-14 09:59:08 +04:00			`# These may be overridden in an eventscript.`
Eventscript function: change service_start into a function. service_start is currently a variable. This makes passing arguments hard. We change it to be a function and put default definitions into the functions file. We use a convention that if a service name argument is passed to a redefined version of service_start() or service_stop() then it will act unconditionally. If no argument is passed then it can use internal logic to decide if services should really be started. This is useful when a single eventscript handles multiple services. This is a cherry-pick of ae38895 that needed to be reset mid-stream. There is still some breakage following this commit. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 86e4aefed9fd1028660c98e3ea758c2b75ffc1d8) 2011-08-11 03:39:25 +04:00			`service_start ()`
			`{`
eventscripts: Simplify handling of $service name in start/stop functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b5802c4735e1c719a5cf9ce69489d5947bd5e8c5) 2013-04-29 21:18:01 +04:00			`service "$service_name" start`
Eventscript function: change service_start into a function. service_start is currently a variable. This makes passing arguments hard. We change it to be a function and put default definitions into the functions file. We use a convention that if a service name argument is passed to a redefined version of service_start() or service_stop() then it will act unconditionally. If no argument is passed then it can use internal logic to decide if services should really be started. This is useful when a single eventscript handles multiple services. This is a cherry-pick of ae38895 that needed to be reset mid-stream. There is still some breakage following this commit. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 86e4aefed9fd1028660c98e3ea758c2b75ffc1d8) 2011-08-11 03:39:25 +04:00			`}`

			`service_stop ()`
			`{`
eventscripts: Simplify handling of $service name in start/stop functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b5802c4735e1c719a5cf9ce69489d5947bd5e8c5) 2013-04-29 21:18:01 +04:00			`service "$service_name" stop`
Eventscripts: Untested factorisations and introduction of status event. This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1) 2009-11-13 10:28:25 +03:00			`}`
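
The comment above says the default `service_start()` and `service_stop()` may be overridden in an eventscript. A minimal sketch of how such an override behaves in plain shell follows. The `service` stub and the `service_name` value are assumptions made so the snippet runs on its own; they are not the real helpers from this file.

```shell
# Stub standing in for the real service() helper (assumption for this sketch).
service ()
{
    echo "service $1 $2"
}

# Hypothetical service name; real eventscripts set their own.
service_name="example"

# Default definition, mirroring the one in the functions file.
service_start ()
{
    service "$service_name" start
}

_default=$(service_start)

# A later definition of the same name replaces the default, which is
# how an eventscript sourced after the functions file can override it.
service_start ()
{
    echo "custom start for $service_name"
}

_override=$(service_start)

echo "$_default"
echo "$_override"
```

Because shell functions are looked up at call time, `ctdb_service_start` would pick up whichever definition is current when it runs.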

Eventscript function: change service_start into a function. service_start is currently a variable. This makes passing arguments hard. We change it to be a function and put default definitions into the functions file. We use a convention that if a service name argument is passed to a redefined version of service_start() or service_stop() then it will act unconditionally. If no argument is passed then it can use internal logic to decide if services should really be started. This is useful when a single eventscript handles multiple services. This is a cherry-pick of ae38895 that needed to be reset mid-stream. There is still some breakage following this commit. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 86e4aefed9fd1028660c98e3ea758c2b75ffc1d8) 2011-08-11 03:39:25 +04:00			`##################################################################`

Eventscript argument cleanups and introduction of ctdb_standard_event_handler. The functions file no longer causes a side-effect by doing a shift. It also doesn't set a convenience variable for $1. All eventscripts now explicitly use "$1" in their case statement, as does the initscript. The absence of a shift means that the takeip/releaseip events now explicitly reference $2-$4 rather than $1-$3. New function ctdb_standard_event_handler handles the status and setstatus events, and exits for either of those events. It is called via a default case in each eventscript, replacing an explicit status case where applicable. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3d55408cbbb3bb71670b80f3dad5639ea0be5b5b) 2009-12-01 09:43:47 +03:00			`ctdb_standard_event_handler ()`
			`{`
			`case "$1" in`
			`status)`
			`ctdb_checkstatus`
			`exit`
			`;;`
			`setstatus)`
ctdb_setstatus in /etc/ctdb/functions was not working correctly because it was called with a wrong parameter list (This used to be ctdb commit e1e285d9f7fa3237dbbacca52a4eb2b264fa5986) 2010-03-10 12:39:31 +03:00			`shift`
Eventscript argument cleanups and introduction of ctdb_standard_event_handler. The functions file no longer causes a side-effect by doing a shift. It also doesn't set a convenience variable for $1. All eventscripts now explicitly use "$1" in their case statement, as does the initscript. The absence of a shift means that the takeip/releaseip events now explicitly reference $2-$4 rather than $1-$3. New function ctdb_standard_event_handler handles the status and setstatus events, and exits for either of those events. It is called via a default case in each eventscript, replacing an explicit status case where applicable. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3d55408cbbb3bb71670b80f3dad5639ea0be5b5b) 2009-12-01 09:43:47 +03:00			`ctdb_setstatus "$@"`
			`exit`
			`;;`
			`esac`
			`}`
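
The commit message for `ctdb_standard_event_handler` says it is called from the default case of each eventscript and exits for the `status` and `setstatus` events. The sketch below illustrates that calling pattern. The `ctdb_checkstatus` and `ctdb_setstatus` stubs and the `handle_event` wrapper are hypothetical stand-ins, and each call runs in a subshell so that the handler's `exit` does not end the demonstration.

```shell
# Stubs for the real status helpers (assumptions for this sketch).
ctdb_checkstatus () { echo "status checked" ; }
ctdb_setstatus () { echo "setstatus: $*" ; }

# Same shape as the handler in the functions file.
ctdb_standard_event_handler ()
{
    case "$1" in
    status)
        ctdb_checkstatus
        exit
        ;;
    setstatus)
        shift
        ctdb_setstatus "$@"
        exit
        ;;
    esac
}

# Hypothetical eventscript body: known events first, handler as default.
handle_event ()
{
    case "$1" in
    monitor)
        echo "monitoring"
        ;;
    *)
        ctdb_standard_event_handler "$@"
        ;;
    esac
}

# Command substitution runs each call in a subshell.
_out_status=$(handle_event status)
_out_set=$(handle_event setstatus ok "all good")
_out_mon=$(handle_event monitor)

echo "$_out_status"
echo "$_out_set"
echo "$_out_mon"
```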

config: wrap iptables in flock to avoid concurrancy. When doing a releaseip event, we do them in parallel for all the separate IPs. This creates a problem for iptables, which isn't reentrant, giving the strange message: iptables encountered unknown error "18446744073709551615" while initializing table "filter" The worst possible symptom of this is that releaseip won't remove the rule which prevents us listening to clients during releaseip, and the node will be healthy but non-responsive. The simple workaround is to flock-wrap iptables. Better would be to rework the code so we didn't need to use iptables in these paths. CQ:S1018353 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 72d6914ee913272312d7b68f1be5ad05ad06587d) 2010-07-12 09:41:42 +04:00			`# iptables doesn't like being re-entered, so flock-wrap it.`
			`iptables()`
			`{`
Eventscripts: iptables() should put lock in $CTDB_VARDIR. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3f04793f391c63b78ffb9c9851ab3f0daf3ed50a) 2011-06-28 09:04:58 +04:00			`flock -w 30 $CTDB_VARDIR/iptables-ctdb.flock /sbin/iptables "$@"`
config: wrap iptables in flock to avoid concurrancy. When doing a releaseip event, we do them in parallel for all the separate IPs. This creates a problem for iptables, which isn't reentrant, giving the strange message: iptables encountered unknown error "18446744073709551615" while initializing table "filter" The worst possible symptom of this is that releaseip won't remove the rule which prevents us listening to clients during releaseip, and the node will be healthy but non-responsive. The simple workaround is to flock-wrap iptables. Better would be to rework the code so we didn't need to use iptables in these paths. CQ:S1018353 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 72d6914ee913272312d7b68f1be5ad05ad06587d) 2010-07-12 09:41:42 +04:00			`}`
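
The `iptables()` wrapper serialises concurrent callers by taking a lock file with `flock -w 30` before running the real command. The following sketch shows the same pattern with a harmless command in place of `/sbin/iptables`. The temporary directory, the lock file name, and the `locked_append` function are made up for illustration, and the snippet assumes the `flock` utility from util-linux is installed.

```shell
# Hypothetical scratch directory, lock file and output file.
_dir=$(mktemp -d)
_lock="$_dir/example.flock"
_out="$_dir/out"

# Append one line to $_out while holding an exclusive lock on $_lock,
# waiting up to 30 seconds for the lock as the iptables() wrapper does.
locked_append ()
{
    flock -w 30 "$_lock" sh -c 'echo "$1" >>"$2"' sh "$1" "$_out"
}

# Start several writers in parallel; flock lets only one run at a time.
for _i in 1 2 3 4 5 ; do
    locked_append "writer $_i" &
done
wait

_count=$(wc -l <"$_out")
echo "lines written: $_count"

rm -rf "$_dir"
```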

scripts: Provide mktemp function for platforms without mktemp command This is needed for AIX and possibly others. Also provide a cheaper mktemp function is needed in the run_tests script. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b2b572e9049c7138bd223226475bef8fe3e01f10) 2013-05-25 13:57:24 +04:00			`# AIX (and perhaps others?) doesn't have mktemp`
			`if ! which mktemp >/dev/null 2>&1 ; then`
			`mktemp ()`
			`{`
			`_dir=false`
			`if [ "$1" = "-d" ] ; then`
			`_dir=true`
			`shift`
			`fi`
			`_d="${TMPDIR:-/tmp}"`
			`_hex10=$(dd if=/dev/urandom count=20 2>/dev/null | \`
			`md5sum | \`
			`sed -e 's@\(..........\).*@\1@')`
			`_t="${_d}/tmp.${_hex10}"`
			`(`
			`umask 077`
			`if $_dir ; then`
			`mkdir "$_t"`
			`else`
			`>"$_t"`
			`fi`
			`)`
			`echo "$_t"`
			`}`
			`fi`
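
The fallback `mktemp` builds its file name from ten hexadecimal characters taken from an MD5 digest of random data. The sketch below isolates that step. The `random_hex10` function name is hypothetical, and the snippet assumes `dd`, `md5sum`, and `/dev/urandom` are available.

```shell
# Hypothetical helper: print 10 hex characters derived from /dev/urandom.
random_hex10 ()
{
    # Hash 20 blocks of random data, then keep only the first
    # 10 characters of the digest via a sed capture group.
    dd if=/dev/urandom count=20 2>/dev/null |
        md5sum |
        sed -e 's@\(..........\).*@\1@'
}

_hex=$(random_hex10)
_name="${TMPDIR:-/tmp}/tmp.${_hex}"

echo "$_name"
```

Unlike the real `mktemp` command, this approach does not retry if the generated name already exists, so it is best treated as a last resort on platforms that lack `mktemp`.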

NFS tickles: use addtickle/deltickle instead of shared tickle directory. This adds a new function update_tickles() that tracks tickles for a given port using the new ctdb addtickle/deltickle commands. This function is used in events.d/60.nfs to handle NFS tickles. events.d/61.nfstickle is removed. The /proc/sys/net/ipv4/tcp_tw_recycle setup is also moved to events.d/60.nfs. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit dca4c4ebf3c35f8db3ae208efb7a83abbf726ed6) 2010-08-26 08:59:59 +04:00			`########################################################`
			`# tickle handling`
			`########################################################`

			`update_tickles ()`
			`{`
			`_port="$1"`

scripts: Clean up update_tickles() and handling of associated directory Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 700cf95a1f29b4b88460a00a55d57a9e397011e0) 2013-04-17 07:26:04 +04:00			`tickledir="$CTDB_VARDIR/state/tickles"`
			`mkdir -p "$tickledir"`
NFS tickles: use addtickle/deltickle instead of shared tickle directory. This adds a new function update_tickles() that tracks tickles for a given port using the new ctdb addtickle/deltickle commands. This function is used in events.d/60.nfs to handle NFS tickles. events.d/61.nfstickle is removed. The /proc/sys/net/ipv4/tcp_tw_recycle setup is also moved to events.d/60.nfs. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit dca4c4ebf3c35f8db3ae208efb7a83abbf726ed6) 2010-08-26 08:59:59 +04:00
			`# Who am I?`
			`_pnn=$(ctdb pnn) ; _pnn=${_pnn#PNN:}`

			`# What public IPs do I hold?`
ctdb-scripts: Update eventscripts to use ctdb -X instead of ctdb -Y Also update associated eventscript unit tests and ctdb stub. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-11-20 06:32:46 +03:00			`_ips=$(ctdb -X ip | awk -F'|' -v pnn=$_pnn '$3 == pnn {print $2}')`
NFS tickles: use addtickle/deltickle instead of shared tickle directory. This adds a new function update_tickles() that tracks tickles for a given port using the new ctdb addtickle/deltickle commands. This function is used in events.d/60.nfs to handle NFS tickles. events.d/61.nfstickle is removed. The /proc/sys/net/ipv4/tcp_tw_recycle setup is also moved to events.d/60.nfs. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit dca4c4ebf3c35f8db3ae208efb7a83abbf726ed6) 2010-08-26 08:59:59 +04:00
			`# IPs as a regexp choice`
			`_ipschoice="($(echo $_ips | sed -e 's/ /|/g' -e 's/\./\\\\./g'))"`

			`# Record connections to our public IPs in a temporary file`
			`_my_connections="${tickledir}/${_port}.connections"`
			`rm -f "$_my_connections"`
			`netstat -tn |`
			`awk -v destpat="^${_ipschoice}:${_port}\$" \`
			`'$1 == "tcp" && $6 == "ESTABLISHED" && $4 ~ destpat {print $5, $4}' |`
			`sort >"$_my_connections"`

			`# Record our current tickles in a temporary file`
			`_my_tickles="${tickledir}/${_port}.tickles"`
			`rm -f "$_my_tickles"`
			`for _i in $_ips ; do`
ctdb-scripts: Update eventscripts to use ctdb -X instead of ctdb -Y Also update associated eventscript unit tests and ctdb stub. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-11-20 06:32:46 +03:00			`ctdb -X gettickles $_i $_port |`
			`awk -F'|' 'NR > 1 { printf "%s:%s %s:%s\n", $2, $3, $4, $5 }'`
NFS tickles: use addtickle/deltickle instead of shared tickle directory. This adds a new function update_tickles() that tracks tickles for a given port using the new ctdb addtickle/deltickle commands. This function is used in events.d/60.nfs to handle NFS tickles. events.d/61.nfstickle is removed. The /proc/sys/net/ipv4/tcp_tw_recycle setup is also moved to events.d/60.nfs. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit dca4c4ebf3c35f8db3ae208efb7a83abbf726ed6) 2010-08-26 08:59:59 +04:00			`done |`
			`sort >"$_my_tickles"`

			`# Add tickles for connections that we haven't already got tickles for`
			`comm -23 "$_my_connections" "$_my_tickles" |`
			`while read _src _dst ; do`
			`ctdb addtickle $_src $_dst`
			`done`

			`# Remove tickles for connections that are no longer there`
			`comm -13 "$_my_connections" "$_my_tickles" |`
			`while read _src _dst ; do`
			`ctdb deltickle $_src $_dst`
			`done`

			`rm -f "$_my_connections" "$_my_tickles"`
			`}`
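
`update_tickles()` reconciles two sorted lists with `comm`: `comm -23` yields connections that have no tickle yet, and `comm -13` yields tickles whose connection has gone away. A small sketch of that set difference follows, using made-up addresses and a hypothetical scratch directory. It prints the results instead of calling `ctdb addtickle` or `ctdb deltickle`.

```shell
# Illustrative data only: these addresses are invented for the sketch.
_tmp=$(mktemp -d)

printf '%s\n' \
    "10.0.0.5:1001 10.0.0.1:2049" \
    "10.0.0.6:1002 10.0.0.1:2049" |
    sort >"$_tmp/connections"

printf '%s\n' \
    "10.0.0.6:1002 10.0.0.1:2049" \
    "10.0.0.7:1003 10.0.0.1:2049" |
    sort >"$_tmp/tickles"

# Lines only in the first file: connections that still need a tickle.
_to_add=$(comm -23 "$_tmp/connections" "$_tmp/tickles")

# Lines only in the second file: tickles with no live connection.
_to_del=$(comm -13 "$_tmp/connections" "$_tmp/tickles")

echo "add: $_to_add"
echo "del: $_to_del"

rm -rf "$_tmp"
```

Both inputs must be sorted in the same order for `comm` to work, which is why the function pipes each list through `sort` before comparing them.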

add possibility to provide site local modifications to the event system through a /etc/ctdb/rc.local script that is sources by /etc/ctdb/functions (This used to be ctdb commit a5b7dd97e3faf0c4f289240307d0e22a67cf2353) 2008-04-10 00:50:12 +04:00			`########################################################`
			`# load a site local config file`
			`########################################################`

Eventscripts: source a file specified by $CTDB_RC_LOCAL in functions file. Another unit testing hook. This is easier than dropping files into rc.local.d/ and then removing them. The file has to be executable. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b13ac3bdaf326a6cdfd87da9195eb9630806c418) 2011-06-28 09:06:10 +04:00			`[ -n "$CTDB_RC_LOCAL" -a -x "$CTDB_RC_LOCAL" ] && {`
			`. "$CTDB_RC_LOCAL"`
			`}`

shell scripts need extra spaces sometime (This used to be ctdb commit f6409b19972fa94257af9aa51def539f639bc226) 2008-04-10 01:01:22 +04:00			`[ -x $CTDB_BASE/rc.local ] && {`
add possibility to provide site local modifications to the event system through a /etc/ctdb/rc.local script that is sources by /etc/ctdb/functions (This used to be ctdb commit a5b7dd97e3faf0c4f289240307d0e22a67cf2353) 2008-04-10 00:50:12 +04:00			`. $CTDB_BASE/rc.local`
			`}`
update the socketkiller in the eventscripts to be able to handle ipv6 (This used to be ctdb commit 6da7b36b7ccc4ee9b809867ea32036f09a801bb3) 2008-08-20 03:47:00 +04:00
add a direcotry where multiple local scripts can be added to run when executing eventscripts (This used to be ctdb commit 27d152a918680a59c7412aec7e1772f25b72d469) 2009-10-19 09:22:15 +04:00			`[ -d $CTDB_BASE/rc.local.d ] && {`
			`for i in $CTDB_BASE/rc.local.d/* ; do`
			`[ -x "$i" ] && . "$i"`
			`done`
			`}`

Eventscript argument cleanups and introduction of ctdb_standard_event_handler. The functions file no longer causes a side-effect by doing a shift. It also doesn't set a convenience variable for $1. All eventscripts now explicitly use "$1" in their case statement, as does the initscript. The absence of a shift means that the takeip/releaseip events now explicitly reference $2-$4 rather than $1-$3. New function ctdb_standard_event_handler handles the status and setstatus events, and exits for either of those events. It is called via a default case in each eventscript, replacing an explicit status case where applicable. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3d55408cbbb3bb71670b80f3dad5639ea0be5b5b) 2009-12-01 09:43:47 +03:00			`script_name="${0##*/}" # basename`
Now vaguely tested initscript updates. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f1e350f9edb74cc44b6c5be4c062fd93e98ba8c4) 2009-11-19 08:48:19 +03:00			`service_fail_limit=1`
Eventscripts: only autostart during a monitor event. Otherwise we might short-circuit events that are run only once and actually need to do something. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c4f9e8a43540bc049b2771e0a2d76d37b9d17331) 2011-01-11 09:12:03 +03:00			`event_name="$1"`
