From d8e2ddc5a823b676b6f040852c0f53220ba0ada2 Mon Sep 17 00:00:00 2001
From: Martin Schwenke
Date: Mon, 12 Oct 2009 16:17:37 +1100
Subject: [PATCH 1/2] 40.vsftpd: reset the fail counter in the "recovered" event.

Each recovery that involves IP reassignments results in a restart of
vsftpd in the "recovered" event. Currently, we can have several
recoveries in quick succession and the "monitor" event following each
can fail because vsftpd isn't ready yet. This results in cumulative
failures, so the node is marked unhealthy, even though vsftpd has
never had a proper opportunity to become ready.

This resets the fail count after each recovery.

While we're here, also move the delete of the restart flag file into
the body of the conditional.

Signed-off-by: Martin Schwenke

(This used to be ctdb commit 318abeb4b913a8d846e7eaf4cf5c2a67b61ce974)
---
 ctdb/config/events.d/40.vsftpd | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ctdb/config/events.d/40.vsftpd b/ctdb/config/events.d/40.vsftpd
index bec786254ae..315c150262a 100755
--- a/ctdb/config/events.d/40.vsftpd
+++ b/ctdb/config/events.d/40.vsftpd
@@ -44,9 +44,9 @@ case $cmd in
 		[ -f $CTDB_BASE/state/vsftpd/restart ] && {
 			service vsftpd stop > /dev/null 2>&1
 			service vsftpd start
+			/bin/rm -f $CTDB_BASE/state/vsftpd/restart 2>/dev/null
+			ctdb_counter_init "$VSFTPD_FAILS"
 		} >/dev/null 2>&1
-
-		/bin/rm -f $CTDB_BASE/state/vsftpd/restart 2>/dev/null
 		;;
 
 	monitor)

From ab98c1b0f100d2bed1a5def42a5b6dde01cc904c Mon Sep 17 00:00:00 2001
From: Martin Schwenke
Date: Mon, 12 Oct 2009 16:32:49 +1100
Subject: [PATCH 2/2] Clean up ctdb_check_directories* eventscript functions.

There are 2 problems with this code:

* The loop in ctdb_check_directories_probe() breaks on filenames
  containing whitespace. The fix to protect them is to pass "$@" to
  this function and have it operate on "$@".

  Note that there's still a problem with whitespace in filenames in
  the 50.samba eventscript. To fix this ctdb_check_directories_probe
  should read the filenames from stdin. Another time...

* The check for '%' in filenames in ctdb_check_directories_probe()
  ends up involving several forks. On a modern machine this can cost
  a couple of minutes when checking a large number of directories.
  The fix is to use a case statement.

Signed-off-by: Martin Schwenke

(This used to be ctdb commit eb1fecaef9aa5cb85dff7d4f7af8a9878deabed8)
---
 ctdb/config/functions | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/ctdb/config/functions b/ctdb/config/functions
index bec4815c431..1117189924f 100644
--- a/ctdb/config/functions
+++ b/ctdb/config/functions
@@ -167,17 +167,20 @@ ctdb_check_rpc() {
 
 ######################################################
 # check a set of directories is available
-# return 0 on a missing directory
+# return 1 on a missing directory
 # usage: ctdb_check_directories_probe SERVICE_NAME
 ######################################################
 ctdb_check_directories_probe() {
     service_name="$1"
     shift
-    wait_dirs="$*"
-    [ -z "$wait_dirs" ] && return;
-    for d in $wait_dirs; do
-        ( echo $d | grep -q '%' ) && continue
-        [ -d $d ] || return 1
+    for d ; do
+        case "$d" in
+            *%*)
+                continue
+                ;;
+            *)
+                [ -d "$d" ] || return 1
+        esac
     done
     return 0
 }
@@ -187,10 +190,8 @@ ctdb_check_directories_probe() {
 # usage: ctdb_check_directories SERVICE_NAME
 ######################################################
 ctdb_check_directories() {
-    service_name="$1"
-    shift
-    wait_dirs="$*"
-    ctdb_check_directories_probe "$service_name" $wait_dirs || {
+    # Note: ctdb_check_directories_probe sets both $service_name and $d.
+    ctdb_check_directories_probe "$@" || {
         echo "ERROR: $service_name directory $d not available"
         exit 1
     }
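
The loop introduced by PATCH 2/2 can be tried outside ctdb with a small standalone script. This is only a sketch: probe_dirs, the "demo" service name and the temporary paths are made up for illustration and are not part of ctdb. It assumes mktemp -d is available. The point it shows is that "for d ; do" iterates over "$@" without word splitting, and that the case pattern skips names containing '%' without forking a subshell or grep.

```shell
#!/bin/sh
# Hypothetical stand-in for ctdb_check_directories_probe (illustration only).
probe_dirs() {
    service_name="$1"
    shift
    # "for d ; do" walks "$@", so an argument containing spaces stays one word.
    for d ; do
        case "$d" in
            *%*)
                # Names containing '%' are skipped; no fork is needed.
                continue
                ;;
            *)
                [ -d "$d" ] || return 1
                ;;
        esac
    done
    return 0
}

# Scratch directory whose name contains a space (assumed layout).
tmp=$(mktemp -d)
mkdir "$tmp/share one"

# Existing directory plus a '%' name that is skipped: expected to succeed.
probe_dirs demo "$tmp/share one" "/nonexistent/%U"; ok_status=$?

# Existing directory plus a missing one: expected to fail, leaving $d set
# to the directory that was not found.
probe_dirs demo "$tmp/share one" "$tmp/missing"; fail_status=$?

echo "ok=$ok_status fail=$fail_status d=$d"
rm -rf "$tmp"
```

Because the function leaves $service_name and $d set in the caller's shell, a wrapper can report the failing directory after a non-zero return, which is what the comment added to ctdb_check_directories relies on.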