glusterfs/doc/features/ganesha-ha.md
Kaleb S KEITHLEY 40a24f5ab9 common-ha: reliable grace using pacemaker notify actions
Using *-dead_ip-1 resources to track on which nodes the ganesha.nfsd
had died was found to be unreliable.

Running `pcs status` in the ganesha_grace monitor action was seen to
time out during failover; the HA devs opined that it was, generally,
not a good idea to run `pcs status` in a monitor action in any event.
They suggested using the notify feature, where the resources on all
the nodes are notified when a clone resource agent dies.

This change adds a notify action to the ganesha_grace RA. The ganesha_mon
RA monitors its ganesha.nfsd daemon. While the daemon is running, it
creates two attributes: ganesha-active and grace-active. When the daemon
stops for any reason, the attributes are deleted. Deleting the
ganesha-active attribute triggers the failover of the virtual IP (the
IPaddr RA) to another node where ganesha.nfsd is still running. The
ganesha_grace RA monitors the grace-active attribute. When the
grace-active attribute is deleted, the ganesha_grace RA stops, and will
not restart. This causes pacemaker to trigger the notify action in
the ganesha_grace RAs on the other nodes in the cluster, which send a
DBUS message to their ganesha.nfsd.

(N.B. grace-active is a bit of a misnomer. While the grace-active
attribute exists, everything is normal and healthy. Deleting the
attribute triggers putting the surviving ganesha.nfsds into GRACE.)

To ensure that the remaining/surviving ganesha.nfsds are put into
NFS-GRACE before the IPaddr (virtual IP) fails over there is a short
delay (sleep) between deleting the grace-active attribute and the
ganesha-active attribute. To summarize:
  1. on node 2 ganesha_mon:monitor notices that ganesha.nfsd has died
  2. on node 2 ganesha_mon:monitor deletes its grace-active attribute
  3. on node 2 ganesha_grace:monitor notices that grace-active is gone
     and returns OCF_ERR_GENERIC, a.k.a. new error. When pacemaker
     tries to (re)start ganesha_grace, its start action will return
     OCF_NOT_RUNNING, a.k.a. known error, don't attempt further
     restarts.
  4. on nodes 1, 3, etc., ganesha_grace:notify receives a post-stop
     notification indicating that node 2 is gone, and sends a DBUS
     message to its ganesha.nfsd putting it into NFS-GRACE.
  5. on node 2 ganesha_mon:monitor waits a short period, then deletes
     its ganesha-active attribute. This triggers the IPaddr (virt IP)
     failover according to constraint location rules.

ganesha_nfsd is modified to run for the duration: its start action is
invoked to set up the /var/lib/nfs symlink, and its stop action is invoked
to restore it. ganesha-ha.sh is modified accordingly to create it as a
clone resource.

BUG: 1290865
Change-Id: I1ba24f38fa4338b3aeb17c65645e9f439387ff57
Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/12964
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-on: http://review.gluster.org/13725
2016-03-14 21:34:27 -07:00

# Overview of Ganesha HA Resource Agents in GlusterFS 3.7
The ganesha_mon RA monitors its ganesha.nfsd daemon. While the
daemon is running, the RA creates two attributes: ganesha-active and
grace-active. When the daemon stops for any reason, the attributes
are deleted. Deleting the ganesha-active attribute triggers the
failover of the virtual IP (the IPaddr RA) to another node where
ganesha.nfsd is still running, according to location constraint
rules.
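
As a rough illustration of the monitor logic described above, here is a minimal sketch. It is not the actual ganesha_mon code: `set_attr`, `del_attr`, `has_attr`, `daemon_alive` and the `DAEMON_RUNNING` flag are hypothetical stand-ins so the example runs without pacemaker, and attributes are modelled as files in a temporary directory. Return codes are not modelled.

```shell
#!/bin/sh
# Sketch only: attributes are files in a temp dir, not real cluster attributes.
ATTR_DIR=$(mktemp -d)
trap 'rm -rf "$ATTR_DIR"' EXIT

set_attr() { echo "$2" > "$ATTR_DIR/$1"; }   # hypothetical helper
del_attr() { rm -f "$ATTR_DIR/$1"; }         # hypothetical helper
has_attr() { [ -e "$ATTR_DIR/$1" ]; }        # hypothetical helper

# Hypothetical liveness probe; a real RA would check the actual daemon.
DAEMON_RUNNING=yes
daemon_alive() { [ "$DAEMON_RUNNING" = yes ]; }

mon_monitor() {
    if daemon_alive; then
        # Daemon is up: make sure both attributes exist.
        set_attr ganesha-active 1
        set_attr grace-active 1
    else
        # Daemon is gone: remove both attributes.
        del_attr grace-active
        del_attr ganesha-active
    fi
    return 0
}

mon_monitor
if has_attr ganesha-active; then echo "daemon up: attributes present"; fi

DAEMON_RUNNING=no
mon_monitor
if ! has_attr ganesha-active; then echo "daemon down: attributes deleted"; fi
```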

The ganesha_grace RA monitors the grace-active attribute. When
the grace-active attribute is deleted, the ganesha_grace RA stops
and will not restart. This causes pacemaker to invoke the notify
action in the ganesha_grace RAs on the other nodes in the cluster,
each of which sends a DBUS message to its own ganesha.nfsd.

(N.B. grace-active is a bit of a misnomer. While the grace-active
attribute exists, everything is normal and healthy. Deleting the
attribute triggers putting the surviving ganesha.nfsds into GRACE.)
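
The notify handling on a surviving node might look roughly like the sketch below. It is not the actual ganesha_grace code. The `OCF_RESKEY_CRM_meta_notify_*` variables are the ones pacemaker exports to notify actions; `send_grace` and `GRACE_SENT_FOR` are hypothetical stand-ins for the DBUS call, and the node name `node2` is an assumption for the simulation.

```shell
#!/bin/sh
# Sketch only: send_grace stands in for the DBUS message to ganesha.nfsd.
GRACE_SENT_FOR=""
send_grace() {
    GRACE_SENT_FOR="$1"
    echo "would ask the local ganesha.nfsd to enter grace for $1"
}

grace_notify() {
    mode="${OCF_RESKEY_CRM_meta_notify_type}-${OCF_RESKEY_CRM_meta_notify_operation}"
    case "$mode" in
        post-stop)
            # A ganesha_grace instance stopped somewhere in the cluster.
            send_grace "${OCF_RESKEY_CRM_meta_notify_stop_uname}"
            ;;
        *)
            # Other notifications are ignored in this sketch.
            ;;
    esac
    return 0
}

# Simulate the notification a surviving node receives after node2 stops.
OCF_RESKEY_CRM_meta_notify_type=post
OCF_RESKEY_CRM_meta_notify_operation=stop
OCF_RESKEY_CRM_meta_notify_stop_uname=node2
grace_notify
```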

To ensure that the surviving ganesha.nfsds are put into
NFS-GRACE before the IPaddr (virtual IP) fails over, there is a
short delay (sleep) between deleting the grace-active attribute
and deleting the ganesha-active attribute. To summarize, e.g. in a
four-node cluster:

1. on node 2 ganesha_mon::monitor notices that ganesha.nfsd has died
2. on node 2 ganesha_mon::monitor deletes its grace-active attribute
3. on node 2 ganesha_grace::monitor notices that grace-active is gone
   and returns OCF_ERR_GENERIC, i.e. a new error. When pacemaker tries
   to (re)start ganesha_grace, its start action returns
   OCF_NOT_RUNNING, i.e. a known error, so no further restarts are
   attempted.
4. on nodes 1, 3, and 4, ganesha_grace::notify receives a post-stop
   notification indicating that node 2 is gone, and sends a DBUS message
   to its ganesha.nfsd, putting it into NFS-GRACE.
5. on node 2 ganesha_mon::monitor waits a short period, then deletes
   its ganesha-active attribute. This triggers the IPaddr (virtual IP)
   failover according to location constraint rules.
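
The ordering in the steps above can be traced with a small single-process simulation. This is only a sketch under stated assumptions: one script plays every node, `cluster_reacts` is a hypothetical stand-in for what pacemaker and the other nodes do during the delay, and the event names are invented for illustration. The numeric values of `OCF_ERR_GENERIC` (1) and `OCF_NOT_RUNNING` (7) are the standard OCF return codes.

```shell
#!/bin/sh
# Sketch only: variables stand in for node 2's attributes.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

GRACE_ACTIVE=1
GANESHA_ACTIVE=1
EVENTS=""
log_event() { EVENTS="${EVENTS}${EVENTS:+ }$1"; }

# Step 3: node 2's ganesha_grace monitor and start actions.
grace_monitor() {
    if [ "$GRACE_ACTIVE" = 1 ]; then return $OCF_SUCCESS; fi
    return $OCF_ERR_GENERIC
}
grace_start() {
    if [ "$GRACE_ACTIVE" = 1 ]; then return $OCF_SUCCESS; fi
    return $OCF_NOT_RUNNING
}

# Steps 3 and 4, in the order the summary lists them.
cluster_reacts() {
    rc=0; grace_monitor || rc=$?
    log_event "node2:grace-monitor-rc=$rc"
    rc=0; grace_start || rc=$?
    log_event "node2:grace-start-rc=$rc"
    for n in node1 node3 node4; do
        log_event "$n:enter-grace"
    done
}

# Steps 1, 2 and 5: node 2's ganesha_mon monitor after the daemon died.
mon_monitor_daemon_dead() {
    GRACE_ACTIVE=0
    log_event "node2:del-grace-active"
    cluster_reacts    # stands in for the short delay (sleep)
    GANESHA_ACTIVE=0
    log_event "node2:del-ganesha-active"
}

mon_monitor_daemon_dead
# Print one event per line (word splitting on $EVENTS is intentional).
printf '%s\n' $EVENTS
```

In this trace, every surviving node logs its enter-grace event before node 2 deletes ganesha-active, which is the property the delay is meant to provide.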