1
0
mirror of https://github.com/samba-team/samba.git synced 2025-02-14 01:57:53 +03:00
Martin Schwenke 45d55dc25b ctdb-recovery: Ban a node that causes recovery failure
... instead of applying banning credits.

There have been a couple of cases where recovery repeatedly takes just
over 2 minutes to fail.  Therefore, banning credits expire between
failures and a continuously problematic node is never banned,
resulting in endless recoveries.  This is because it takes 2
applications of banning credits before a node is banned, which
generally involves 2 recovery failures.

The recovery helper makes up to 3 attempts to recover each database
during a single run.  If a node causes 3 failures then this is really
equivalent to 3 recovery failures in the model that existed before the
recovery helper added retries.  In that case the node would have been
banned after 2 failures.

So, instead of applying banning credits to the "most failing" node,
simply ban it directly from the recovery helper.

If multiple nodes are causing recovery failures then this can cause a
node to be banned more quickly than it might otherwise have been, even
pre-recovery-helper.  However, 90 seconds (i.e. 3 failures) is a long
time to be in recovery, so banning earlier seems like the best
approach.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13670

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Nov  5 06:52:33 CET 2018 on sn-devel-144

(cherry picked from commit 27df4f002a594dbb2f2a38afaccf3e22f19818e1)
2018-11-06 09:10:23 +01:00
..
2016-08-10 08:18:16 +02:00
2018-07-10 10:44:13 +02:00
2007-06-07 22:16:48 +10:00
2009-02-07 08:10:34 +11:00

This is the release version of CTDB, a clustered implementation of TDB
database used by Samba and other projects to store temporary data.

This software is freely distributable under the GNU public license,
a copy of which you should have received with this software (in a file
called COPYING).

For documentation on CTDB, please visit CTDB website http://ctdb.samba.org.