mirror of https://github.com/samba-team/samba.git synced 2025-01-11 05:18:09 +03:00

Make ctdb_diagnostics more resilient to uncontactable nodes.

Current behaviour is for onnode to time out (after about 20s) on each
attempted ssh to a down node.  With 40 or 50 invocations of onnode
this takes a long time.

Two changes work around this:

* If EXTRA_SSH_OPTS (which is passed to ssh by onnode) does not
  contain a ConnectTimeout= setting then add a setting for a 5 second
  timeout.

* Filter the nodes before starting any diagnosis, taking out any "bad
  nodes" that are uncontactable via onnode.

  In the nodes summary at the beginning of the output, print
  information about any "bad nodes".
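
The two changes above can be sketched as a standalone script that runs
without a cluster.  This is only an illustration, not the code from the
commit: with_default_timeout, filter_nodes and fake_onnode are
hypothetical names, and fake_onnode stands in for onnode by pretending
that node 2 is down.

```shell
#!/bin/sh

# Hypothetical helper: print the given ssh options, appending a 5 second
# ConnectTimeout only when the caller has not already set one.
with_default_timeout ()
{
    case "$1" in
    *ConnectTimeout=*) printf '%s\n' "$1" ;;
    *) printf '%s\n' "${1}${1:+ }-o ConnectTimeout=5" ;;
    esac
}

# Hypothetical helper: $1 is the probe command (onnode in the real
# script), the remaining arguments are node numbers.  Sets $good_nodes
# (space separated) and $bad_nodes (comma separated).
filter_nodes ()
{
    _probe="$1" ; shift
    good_nodes=""
    bad_nodes=""
    for _i in "$@" ; do
        if "$_probe" "$_i" true >/dev/null 2>&1 ; then
            good_nodes="${good_nodes}${good_nodes:+ }${_i}"
        else
            bad_nodes="${bad_nodes}${bad_nodes:+,}${_i}"
        fi
    done
}

# Assumption for the demo only: a stand-in for onnode where node 2 is
# uncontactable and every other node answers.
fake_onnode ()
{
    [ "$1" != "2" ]
}

with_default_timeout "-o BatchMode=yes"
filter_nodes fake_onnode 0 1 2 3
echo "good: $good_nodes"
echo "bad: $bad_nodes"
```

Probing each node once up front means every later onnode invocation
only targets nodes that answered, so the per-node ssh timeout is paid
at most once per down node.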

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 8c3b6427dbaade87e1a0f5590f0894c2e69b31a3)
Martin Schwenke 2011-10-07 15:00:42 +11:00 committed by Ronnie Sahlberg
parent a19ec048ca
commit 71b8015ccf


@@ -18,6 +18,7 @@ EOF
}
nodes=$(ctdb listnodes -Y | cut -d: -f2)
bad_nodes=""
diff_opts=
no_ads=false
@@ -45,6 +46,25 @@ parse_options ()
parse_options "$@"
# Use 5s ssh timeout if EXTRA_SSH_OPTS doesn't set a timeout.
case "$EXTRA_SSH_OPTS" in
    *ConnectTimeout=*) : ;;
    *)
	export EXTRA_SSH_OPTS="${EXTRA_SSH_OPTS} -o ConnectTimeout=5"
esac

# Filter nodes.  Remove any nodes we can't contact from $nodes and add
# them to $bad_nodes.
_nodes=""
for _i in $nodes ; do
    if onnode $_i true >/dev/null 2>&1 ; then
	_nodes="${_nodes}${_nodes:+ }${_i}"
    else
	bad_nodes="${bad_nodes}${bad_nodes:+,}${_i}"
    fi
done
nodes="$_nodes"

nodes_comma=$(echo $nodes | sed -e 's@[[:space:]]@,@g')
PATH="$PATH:/sbin:/usr/sbin:/usr/lpp/mmfs/bin"
@@ -138,11 +158,23 @@ NUM_ERRORS=0
cat <<EOF
Diagnosis started on these nodes:
$nodes_comma
EOF
if [ -n "$bad_nodes" ] ; then
cat <<EOF
NOT RUNNING DIAGNOSTICS on these uncontactable nodes:
$bad_nodes
EOF
fi
cat <<EOF
For reference, here is the nodes file on the current node...
EOF
show_file /etc/ctdb/nodes
show_file /etc/ctdb/nodes
cat <<EOF
--------------------------------------------------------------------