samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2024-12-22 13:34:15 +03:00

221 lines

7.6 KiB

Plaintext

added hooks to make nfs statd behave correctly on failover (This used to be ctdb commit a1ee84fc47892b6c18d417ccf714211fcb07952e) 2007-05-31 05:09:45 +04:00			`#!/bin/sh`

statd-callout: Make sure statd callout script always runs as root In RHEL 6+, rpc.statd runs as "rpcuser" instead of root as on RHEL 5. This prevents CTDB tool commands talking to daemon since "rpcuser" cannot access CTDB socket. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Pair-Programmed-With: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit fe8c4880b371492a38554868d4ca10918c54e412) 2013-04-03 07:44:08 +04:00			`# This must run as root as CTDB tool commands need to access CTDB socket`
ctdb-scripts: Avoid shellcheck warnings SC2046, SC2086 (double-quoting) SC2046: Quote this to prevent word splitting. SC2086: Double quote to prevent globbing and word splitting. Add some quoting where it makes sense. Use shellcheck directives for false-positives. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-07-06 10:31:51 +03:00			`[ "$(id -u)" -eq 0 ] || exec sudo "$0" "$@"`
statd-callout: Make sure statd callout script always runs as root In RHEL 6+, rpc.statd runs as "rpcuser" instead of root as on RHEL 5. This prevents CTDB tool commands talking to daemon since "rpcuser" cannot access CTDB socket. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Pair-Programmed-With: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit fe8c4880b371492a38554868d4ca10918c54e412) 2013-04-03 07:44:08 +04:00
split out events for each subsystem separately (This used to be ctdb commit 03c629a72f234dcc783fa1085e7edba09597c241) 2007-06-01 14:54:26 +04:00			`# this script needs to be installed so that statd points to it with the -H`
docs on how to use statd-callout (This used to be ctdb commit 4a75111b4f3f93dc42c9ced2d23f3cc933712017) 2007-06-02 13:45:06 +04:00			`# command line argument. The easiest way to do that is to put something like this in`
			`# /etc/sysconfig/nfs:`
			`# STATD_HOSTNAME="myhostname -H /etc/ctdb/statd-callout"`

scripts: statd-callout should calculate CTDB_BASE if it is not set Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 376015ba5ad6b7703ae9949a1d40a0c72dfaba0c) 2013-01-03 08:33:10 +04:00			`[ -n "$CTDB_BASE" ] || \`
ctdb-scripts: Update script boilerplate to avoid shellcheck warnings * Assign the output of dirname to temporary variable to avoid word splitting when directory name contains whitespace * Drop export of CTDB_BASE to avoid masking broken return value - functions file does the export anyway * Quote path when including functions file Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-29 10:36:05 +03:00			`CTDB_BASE=$(d=$(dirname "$0") ; cd -P "$d" ; dirname "$PWD")`
cope with non-standard install dirs in event scripts (This used to be ctdb commit 52fff5345873690a9cc86495f414343eaa3bd540) 2007-09-14 08:14:03 +04:00
ctdb-scripts: Update script boilerplate to avoid shellcheck warnings * Assign the output of dirname to temporary variable to avoid word splitting when directory name contains whitespace * Drop export of CTDB_BASE to avoid masking broken return value - functions file does the export anyway * Quote path when including functions file Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-29 10:36:05 +03:00			`. "${CTDB_BASE}/functions"`
ctdb-scripts: Rewrite statd-callout to avoid 10 minute lag This is naive and assumes no performance problems when updating persistent DBs. It also does no error handling. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-08 09:41:11 +04:00
			`# Overwrite this so we get some logging`
			`die ()`
			`{`
			`script_log "statd-callout" "$@"`
			`exit 1`
			`}`

config: load 'ctdb' config before 'nfs' config in statd-callout All other scripts do 'loadconfig ctdb' before any other 'loadconfig foo' call. I think we should do the same in statd-callout. Otherwise it's very confusing, if you have configured some Options in /etc/sysconfig/ctdb, but /etc/ctdb/statd-callout doesn't notice them. metze (This used to be ctdb commit 10d95581fb90bfdf58ec32345c4e36c27acf4f37) 2009-11-09 17:06:59 +03:00			`loadconfig ctdb`
make the init scripts more portable about location of system config files (This used to be ctdb commit 65f3e2bc722e314b2c51c3bfdc544b408a8a64cf) 2007-06-03 16:07:07 +04:00			`loadconfig nfs`
added hooks to make nfs statd behave correctly on failover (This used to be ctdb commit a1ee84fc47892b6c18d417ccf714211fcb07952e) 2007-05-31 05:09:45 +04:00
ctdb-scripts: Rewrite statd-callout to avoid 10 minute lag This is naive and assumes no performance problems when updating persistent DBs. It also does no error handling. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-08 09:41:11 +04:00			`[ -n "$NFS_HOSTNAME" ] || \`
			`die "NFS_HOSTNAME is not configured. statd-callout failed"`

			`# A handy newline`
			`nl="`
			`"`
60.nfs: we must always restart the lockmanager when the cluster has been reconfigured and ip addresses has changed. This is to make sure we get a clusterwide grace period for nfs locking. if we dont do this and only restart locking on the nodes that were direclty affected, a different client can take out a conflicting lock from a different node before affected clients has had a chance to reclaim all the locks lost during reconfigure. grace period on rhel5 kernel has bene increased to 90 seconds! statd-callout: we must restart lockmanager to ensure a clusterwide grace period for nfs. this makes locking "more correct" for nfs clients and prevents other clients/nodes from taking out a conflicting lock while a different client/node tries to reclaim lost locks. This makes it "almost consistent" for NFS clients but there is still the possibility that a cifs client can take out a conflicting lock before an nfs client has had a chance to reclaim an existing lock. This can not be solved with anything less than making the kernel nfs lock manager "samba aware" and making samba aware of the internal state of the kernel lock manager so that they can cooperate. we can not just stop/start the lockmanager back to back in rhel5 since if they are stopped/started too close to eachother then when the new lockmanager upon starting up sends out statd notifications two things can happen: 1, new lockmanager sends out notification BEFORE it has registered with portmapper leading to lockmanager starts lockmanager sends notification to the client client tries to recover the lock and tries to portmap the lockmanager port on the server. server is not (yet) registered with portmapper and server responds "no such program" to hte clients request to discover where lockmanager is. client then just completely gives up reclaiming the lock and doesnt even reattempt the portmapper call after some timeout. ==> lock reclaim failed. 
2, if they are started back to back, and a client tries to reclaim the lock the lockmanager sometimes sends two responses back to back to the client. one with status NLM_GRANTED (==you got the lock reclaimed) and one with status NLM_DENIED (==you could not get the lock reclaimed) This confuses the client and leads to the server thinking that the client does have the lock and the client thinking it has not got the lock and orphaned locks result. We also send out additional notification messages of different formats to allow more legacy clients to interoperate with locking. (This used to be ctdb commit 13208c1aab2942e28dff87e38e6794bf0c026033) 2007-09-07 02:52:56 +04:00
ctdb-scripts: Change statd-callout to be more scalable Updating ctdb.tdb on each add-client, del-client and each delete during notify was too ambitious. Persistent transactions do not perform well enough to do this. Revert to having add-client and del-client create touch files. Each monitor event calls "statd-callout update" to convert touch files into ctdb.tdb records. Update testcases to do the "update" and add an extra test. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-13 12:55:43 +03:00			`ctdb_setup_service_state_dir "statd-callout"`

			`cd "$service_state_dir" || \`
			`die "Failed to change directory to \"${service_state_dir}\""`

added hooks to make nfs statd behave correctly on failover (This used to be ctdb commit a1ee84fc47892b6c18d417ccf714211fcb07952e) 2007-05-31 05:09:45 +04:00			`case "$1" in`
ctdb-scripts: Change statd-callout to be more scalable Updating ctdb.tdb on each add-client, del-client and each delete during notify was too ambitious. Persistent transactions do not perform well enough to do this. Revert to having add-client and del-client create touch files. Each monitor event calls "statd-callout update" to convert touch files into ctdb.tdb records. Update testcases to do the "update" and add an extra test. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-13 12:55:43 +03:00			`# Keep a single file to keep track of the last "add-client" or`
			`# "del-client". These get pushed to ctdb.tdb during "update",`
			`# which will generally be run once each "monitor" cycle. In this`
			`# way we avoid scalability problems with flood of persistent`
			`# transactions after a "notify" when all the clients re-take their`
			`# locks.`

ctdb-scripts: Rewrite statd-callout to avoid 10 minute lag This is naive and assumes no performance problems when updating persistent DBs. It also does no error handling. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-08 09:41:11 +04:00			`add-client)`
			`# statd does not tell us to which IP the client connected so`
			`# we must add it to all the IPs that we serve`
			`cip="$2"`
ctdb-scripts: Changed uses of "ctdb xpnn" to ctdb_get_pnn() "ctdb xpnn" does not work when sysctl net.ipv4.ip_nonlocal_bind=1, since it determines the node by attempting to bind to each addres in the nodes file. The solution is to not use "ctdb xpnn". After the initial call, ctdb_get_pnn() will be more efficient that "ctdb xpnn". Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-04-18 15:00:49 +03:00			`ctdb_get_pnn`
ctdb-scripts: Rewrite statd-callout to avoid 10 minute lag This is naive and assumes no performance problems when updating persistent DBs. It also does no error handling. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-08 09:41:11 +04:00			`date=$(date '+%s')`
ctdb-scripts: Avoid shellcheck warning SC2034 (unused variables) SC2034: VAR appears unused. Verify it or export it. Drop some variables that are unnecessarily used. Use shellcheck directive for false-positives. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-07-06 10:16:44 +03:00			`# x is intentionally ignored`
			`# shellcheck disable=SC2034`
ctdb: use properly configured ctdb in statd-callout Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-06-08 13:28:56 +03:00			`$CTDB ip -X |`
ctdb-scripts: Change statd-callout to be more scalable Updating ctdb.tdb on each add-client, del-client and each delete during notify was too ambitious. Persistent transactions do not perform well enough to do this. Revert to having add-client and del-client create touch files. Each monitor event calls "statd-callout update" to convert touch files into ctdb.tdb records. Update testcases to do the "update" and add an extra test. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-13 12:55:43 +03:00			`tail -n +2 |`
			`while IFS="|" read x sip node x ; do`
			`[ "$node" = "$pnn" ] || continue # not us`
			`key="statd-state@${sip}@${cip}"`
			`echo "\"${key}\" \"${date}\"" >"$key"`
			`done`
added hooks to make nfs statd behave correctly on failover (This used to be ctdb commit a1ee84fc47892b6c18d417ccf714211fcb07952e) 2007-05-31 05:09:45 +04:00			`;;`
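
The add-client branch above can be tried outside a cluster with a small stand-in. This is only a sketch: `sample_ips` is a hypothetical helper that replaces `$CTDB ip -X`, its header and three address lines are assumed sample data, and `write_add_client_records` is a name invented for the illustration. It writes one state file per public IP hosted by the given node, in the current directory, using the same key and record layout as the script.

```shell
#!/bin/sh
# Hypothetical stand-in for "$CTDB ip -X" (assumed sample data).
sample_ips() {
	cat <<'EOF'
|Public IP|Node|
|10.0.0.31|0|
|10.0.0.32|1|
|10.0.0.33|0|
EOF
}

# write_add_client_records PNN CLIENT_IP DATE
# Hypothetical helper: for every public IP hosted by node PNN, write a
# state file named after the key into the current directory.
write_add_client_records() {
	pnn="$1"
	cip="$2"
	date="$3"
	# x soaks up the empty leading field and whatever trails the node
	# shellcheck disable=SC2034
	sample_ips |
	tail -n +2 |
	while IFS="|" read -r x sip node x ; do
		[ "$node" = "$pnn" ] || continue # not hosted by this node
		key="statd-state@${sip}@${cip}"
		echo "\"${key}\" \"${date}\"" >"$key"
	done
}
```

Calling `write_add_client_records 0 192.168.1.50 "$(date '+%s')"` in a scratch directory should leave files for 10.0.0.31 and 10.0.0.33 only, since 10.0.0.32 belongs to node 1 in the sample.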
ctdb-scripts: Change statd-callout to be more scalable Updating ctdb.tdb on each add-client, del-client and each delete during notify was too ambitious. Persistent transactions do not perform well enough to do this. Revert to having add-client and del-client create touch files. Each monitor event calls "statd-callout update" to convert touch files into ctdb.tdb records. Update testcases to do the "update" and add an extra test. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-13 12:55:43 +03:00
			`del-client)`
ctdb-scripts: Rewrite statd-callout to avoid 10 minute lag This is naive and assumes no performance problems when updating persistent DBs. It also does no error handling. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-08 09:41:11 +04:00			`# statd does not tell us from which IP the client disconnected`
			`# so we must remove it from all the IPs that we serve`
			`cip="$2"`
ctdb-scripts: Changed uses of "ctdb xpnn" to ctdb_get_pnn() "ctdb xpnn" does not work when sysctl net.ipv4.ip_nonlocal_bind=1, since it determines the node by attempting to bind to each addres in the nodes file. The solution is to not use "ctdb xpnn". After the initial call, ctdb_get_pnn() will be more efficient that "ctdb xpnn". Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-04-18 15:00:49 +03:00			`ctdb_get_pnn`
ctdb-scripts: Avoid shellcheck warning SC2034 (unused variables) SC2034: VAR appears unused. Verify it or export it. Drop some variables that are unnecessarily used. Use shellcheck directive for false-positives. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-07-06 10:16:44 +03:00			`# x is intentionally ignored`
			`# shellcheck disable=SC2034`
ctdb: use properly configured ctdb in statd-callout Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-06-08 13:28:56 +03:00			`$CTDB ip -X |`
ctdb-scripts: Change statd-callout to be more scalable Updating ctdb.tdb on each add-client, del-client and each delete during notify was too ambitious. Persistent transactions do not perform well enough to do this. Revert to having add-client and del-client create touch files. Each monitor event calls "statd-callout update" to convert touch files into ctdb.tdb records. Update testcases to do the "update" and add an extra test. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-13 12:55:43 +03:00			`tail -n +2 |`
			`while IFS="|" read x sip node x ; do`
			`[ "$node" = "$pnn" ] || continue # not us`
			`key="statd-state@${sip}@${cip}"`
			`echo "\"${key}\" \"\"" >"$key"`
			`done`
added hooks to make nfs statd behave correctly on failover (This used to be ctdb commit a1ee84fc47892b6c18d417ccf714211fcb07952e) 2007-05-31 05:09:45 +04:00			`;;`
ctdb-scripts: Change statd-callout to be more scalable Updating ctdb.tdb on each add-client, del-client and each delete during notify was too ambitious. Persistent transactions do not perform well enough to do this. Revert to having add-client and del-client create touch files. Each monitor event calls "statd-callout update" to convert touch files into ctdb.tdb records. Update testcases to do the "update" and add an extra test. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-13 12:55:43 +03:00
			`update)`
			`files=$(echo statd-state@*)`
			`if [ "$files" = "statd-state@*" ] ; then`
			`# No files!`
			`exit 0`
			`fi`
			`# Filter out lines for any IP addresses that are not currently`
			`# hosted public IP addresses.`
ctdb-scripts: Changed uses of "ctdb xpnn" to ctdb_get_pnn() "ctdb xpnn" does not work when sysctl net.ipv4.ip_nonlocal_bind=1, since it determines the node by attempting to bind to each addres in the nodes file. The solution is to not use "ctdb xpnn". After the initial call, ctdb_get_pnn() will be more efficient that "ctdb xpnn". Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-04-18 15:00:49 +03:00			`ctdb_get_pnn`
ctdb: use properly configured ctdb in statd-callout Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-06-08 13:28:56 +03:00			`ctdb_ips=$($CTDB ip | tail -n +2)`
ctdb-scripts: Change statd-callout to be more scalable Updating ctdb.tdb on each add-client, del-client and each delete during notify was too ambitious. Persistent transactions do not perform well enough to do this. Revert to having add-client and del-client create touch files. Each monitor event calls "statd-callout update" to convert touch files into ctdb.tdb records. Update testcases to do the "update" and add an extra test. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-13 12:55:43 +03:00			`sed_expr=$(echo "$ctdb_ips" |`
ctdb-scripts: Quote some variable expansions This avoids relevant shellcheck warnings. This is most of the shellcheck low hanging fruit in the non-test code. Many of the other warnings produced by shellcheck are either false positives, are non-trivial to fix or a fix may result in worse code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Wed Jul 6 08:15:49 CEST 2016 on sn-devel-144 2016-06-29 11:11:44 +03:00			`awk -v pnn="$pnn" 'pnn == $2 { \`
ctdb-scripts: Change statd-callout to be more scalable Updating ctdb.tdb on each add-client, del-client and each delete during notify was too ambitious. Persistent transactions do not perform well enough to do this. Revert to having add-client and del-client create touch files. Each monitor event calls "statd-callout update" to convert touch files into ctdb.tdb records. Update testcases to do the "update" and add an extra test. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-13 12:55:43 +03:00			`ip = $1; gsub(/\./, "\\.", ip); \`
			`printf "/statd-state@%s@/p\n", ip }')`
ctdb-scripts: Avoid shellcheck warnings SC2046, SC2086 (double-quoting) SC2046: Quote this to prevent word splitting. SC2086: Double quote to prevent globbing and word splitting. Add some quoting where it makes sense. Use shellcheck directives for false-positives. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-07-06 10:31:51 +03:00			`# Intentional multi-word expansion for multiple files`
			`# shellcheck disable=SC2086`
ctdb: use properly configured ctdb in statd-callout Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-06-08 13:28:56 +03:00			`if cat $files | sed -n "$sed_expr" | $CTDB ptrans "ctdb.tdb" ; then`
ctdb-scripts: Change statd-callout to be more scalable Updating ctdb.tdb on each add-client, del-client and each delete during notify was too ambitious. Persistent transactions do not perform well enough to do this. Revert to having add-client and del-client create touch files. Each monitor event calls "statd-callout update" to convert touch files into ctdb.tdb records. Update testcases to do the "update" and add an extra test. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-13 12:55:43 +03:00			`rm $files`
			`fi`
			`;;`
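
The filtering step in the update branch above builds a sed script with awk and then uses it to keep only the records for locally hosted public IPs. The sketch below isolates that step so it can be run without CTDB. The function names `build_sed_expr` and `filter_hosted_records` are invented for the illustration, and the "IP NODE" input layout is an assumption about what remains of `$CTDB ip` output once the header line is dropped.

```shell
#!/bin/sh
# build_sed_expr PNN
# Hypothetical helper: reads "IP NODE" lines on stdin and prints one
# sed print command per IP hosted by node PNN. Dots in the address are
# escaped so that they match literally.
build_sed_expr() {
	awk -v pnn="$1" 'pnn == $2 {
		ip = $1; gsub(/\./, "\\.", ip);
		printf "/statd-state@%s@/p\n", ip }'
}

# filter_hosted_records PNN IPS_FILE
# Hypothetical helper: reads state records on stdin and prints only
# those whose key names an IP that IPS_FILE assigns to node PNN.
filter_hosted_records() {
	sed_expr=$(build_sed_expr "$1" <"$2")
	sed -n "$sed_expr"
}
```

In the script itself the surviving records are piped to `$CTDB ptrans "ctdb.tdb"`, and the state files are removed only if that command succeeds.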

ctdb-scripts: Rewrite statd-callout to avoid 10 minute lag This is naive and assumes no performance problems when updating persistent DBs. It also does no error handling. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-08 09:41:11 +04:00			`notify)`
60.nfs: we must always restart the lockmanager when the cluster has been reconfigured and ip addresses has changed. (This used to be ctdb commit 13208c1aab2942e28dff87e38e6794bf0c026033) 2007-09-07 02:52:56 +04:00			`# we must restart the lockmanager (on all nodes) so that we get`
Fix various spelling errors Reviewed-by: Andrew Bartlett <abartlet@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Andrew Bartlett <abartlet@samba.org> Autobuild-Date(master): Fri Nov 6 13:43:45 CET 2015 on sn-devel-104 2015-07-27 00:02:57 +03:00			`# a clusterwide grace period (so other clients don't take out`
60.nfs: we must always restart the lockmanager when the cluster has been reconfigured and ip addresses has changed. (This used to be ctdb commit 13208c1aab2942e28dff87e38e6794bf0c026033) 2007-09-07 02:52:56 +04:00			`# conflicting locks through other nodes before all locks have been`
			`# reclaimed)`

			`# we need these settings to make sure that no tcp connections survive`
			`# across a very fast failover/failback`
dont set parameters in statd-callout if they should be set they bshould be set from 10.interfaces (This used to be ctdb commit 0c7c2dae0a976922de58793d576855bc37cd38e1) 2007-10-22 04:18:38 +04:00			`#echo 10 > /proc/sys/net/ipv4/tcp_fin_timeout`
dont set some of the sysctl variables in statd-callout. these are mainly useful for avoiding ack-storms when doing very rapid failover/failback during testing but should not be required in real-world. this gets rid of a lof of annoying messages from the messages file (This used to be ctdb commit 50d289dcce2caa7c7be9b6faa3b38b69c2237038) 2007-10-21 00:42:33 +04:00			`#echo 0 > /proc/sys/net/ipv4/tcp_max_tw_buckets`
			`#echo 0 > /proc/sys/net/ipv4/tcp_max_orphans`
60.nfs: we must always restart the lockmanager when the cluster has been reconfigured and ip addresses has changed. (This used to be ctdb commit 13208c1aab2942e28dff87e38e6794bf0c026033) 2007-09-07 02:52:56 +04:00
Fix various spelling errors Reviewed-by: Andrew Bartlett <abartlet@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Andrew Bartlett <abartlet@samba.org> Autobuild-Date(master): Fri Nov 6 13:43:45 CET 2015 on sn-devel-104 2015-07-27 00:02:57 +03:00			`# Delete the notification list for statd, we don't want it to`
Remove the dependency on the underlying cluster filesystem for handling the clusterwide persistent data associated with the lock manager and statd notifications. Use persistent databases to store this data instead of a shared directory. (This used to be ctdb commit fc0678d351187cfa4c71123f97c0f493aacd5d16) 2010-08-30 12:13:28 +04:00			`# ping any clients`
60.nfs: we must always restart the lockmanager when the cluster has been reconfigured and ip addresses has changed. (This used to be ctdb commit 13208c1aab2942e28dff87e38e6794bf0c026033) 2007-09-07 02:52:56 +04:00			`rm -f /var/lib/nfs/statd/sm/*`
			`rm -f /var/lib/nfs/statd/sm.bak/*`
add a short delay after stopping nfslock to make it less likely that "weird" things happen (This used to be ctdb commit 4934c083cbcc19714094e08a0b7da1fb6fdc8a5a) 2007-09-07 06:14:53 +04:00
			`# we must keep a monotonically increasing state variable for the entire`
			`# cluster so state always increases when ip addresses fail from one`
			`# node to another`
Remove the dependency on the underlying cluster filesystem for handling the clusterwide persistent data associated with the lock manager and statd notifications. Use persistent databases to store this data instead of a shared directory. (This used to be ctdb commit fc0678d351187cfa4c71123f97c0f493aacd5d16) 2010-08-30 12:13:28 +04:00			`# We use epoch and hope the nodes are close enough in clock.`
			`# Even numbers mean service is shut down, odd numbers mean`
			`# service is started.`
ctdb-scripts: Rewrite statd-callout to avoid 10 minute lag This is naive and assumes no performance problems when updating persistent DBs. It also does no error handling. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-08 09:41:11 +04:00			`state_even=$(( $(date '+%s') / 2 * 2))`
ctdb-scripts: Parameterise 60.nfs with $CTDB_NFS_CALLOUT The goal is to have a single NFS eventscript. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-06-24 14:36:14 +03:00			`# We must also let some time pass between stopping and`
			`# restarting the lock manager. Otherwise there is a window`
			`# where the lock manager will respond "strangely" immediately`
			`# after restarting it, which causes clients to fail to reclaim`
			`# their locks.`
ctdb-scripts: Remove 60.ganesha, replace with callout for 60.nfs This isn't a straightforward move of code from 60.ganesha to the callout. Simplifications have been made to allow better interoperation with the new NFS checking logic. The following configuration variables have been removed: CTDB_GANESHA_REC_SUBDIR Edit NFS ganesha callout to change this location CTDB_NFS_SERVER_MODE, NFS_SERVER_MODE Use CTDB_NFS_CALLOUT instead CTDB_NFS_SKIP_KNFSD_ALIVE_CHECK, CTDB_SKIP_GANESHA_NFSD_CHECK Disable the corresponding .check file instead Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-07-01 11:32:35 +03:00			`"$CTDB_NFS_CALLOUT" "stop" "nlockmgr" >/dev/null 2>&1`
			`sleep 2`
			`"$CTDB_NFS_CALLOUT" "start" "nlockmgr" >/dev/null 2>&1`
			`# we now need to send out additional statd notifications to ensure`
			`# that clients understand that the lockmanager has restarted.`
			`# we have three cases:`
			`# 1, clients that ignore the ip address the stat notification came from`
			`# and ONLY care about the 'name' in the notify packet.`
			`# these clients ONLY work with lock failover IFF that name`
			`# can be resolved into an ipaddress that matches the one used`
			`# to mount the share. (==linux clients)`
			`# This is handled when starting lockmanager above, but those`
			`# packets are sent from the "wrong" ip address, something linux`
			`# clients are ok with, but other clients will barf at.`
			`# 2, Some clients only accept statd packets IFF they come from the`
			`# 'correct' ip address.`
			`# 2a,Send out the notification using the 'correct' ip address and also`
			`# specify the 'correct' hostname in the statd packet.`
			`# Some clients require both the correct source address and also the`
			`# correct name. (these clients also ONLY work if the ip addresses`
			`# used to map the share can be resolved into the name returned in`
			`# the notify packet.)`
			`# 2b,Other clients require that the source ip address of the notify`
			`# packet matches the ip address used to take out the lock.`
			`# I.e. that the correct source address is used.`
			`# These clients also require that the name in the statd notify packet`
			`# is the ip address used when the lock was taken out.`
			`#`
			`# Both 2a and 2b are commonly used in lockmanagers since they maximize`
			`# probability that the client will accept the statd notify packet and`
			`# not just ignore it.`
Remove the dependency on the underlying cluster filesystem for handling the clusterwide persistent data associated with the lock manager and statd notifications. Use persistent databases to store this data instead of a shared directory. (This used to be ctdb commit fc0678d351187cfa4c71123f97c0f493aacd5d16) 2010-08-30 12:13:28 +04:00			`# For all IPs we serve, collect info and push to the config database`
ctdb-scripts: Changed uses of "ctdb xpnn" to ctdb_get_pnn() "ctdb xpnn" does not work when sysctl net.ipv4.ip_nonlocal_bind=1, since it determines the node by attempting to bind to each address in the nodes file. The solution is to not use "ctdb xpnn". After the initial call, ctdb_get_pnn() will be more efficient than "ctdb xpnn". Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-04-18 15:00:49 +03:00			`ctdb_get_pnn`
			`# Construct a sed expression to take catdb output and produce pairs of:`
			`# server-IP client-IP`
			`# but only for the server-IPs that are hosted on this node.`
ctdb: use properly configured ctdb in statd-callout Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-06-08 13:28:56 +03:00			`ctdb_all_ips=$($CTDB ip all | tail -n +2)`
ctdb-scripts: Change statd-callout to be more scalable Updating ctdb.tdb on each add-client, del-client and each delete during notify was too ambitious. Persistent transactions do not perform well enough to do this. Revert to having add-client and del-client create touch files. Each monitor event calls "statd-callout update" to convert touch files into ctdb.tdb records. Update testcases to do the "update" and add an extra test. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-13 12:55:43 +03:00			`sed_expr=$(echo "$ctdb_all_ips" |`
ctdb-scripts: Quote some variable expansions This avoids relevant shellcheck warnings. This is most of the shellcheck low hanging fruit in the non-test code. Many of the other warnings produced by shellcheck are either false positives, are non-trivial to fix or a fix may result in worse code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Wed Jul 6 08:15:49 CEST 2016 on sn-devel-144 2016-06-29 11:11:44 +03:00			`awk -v pnn="$pnn" 'pnn == $2 { \`
ctdb-scripts: Fix a regression in statd-callout Commit 4638010abb116aed0c180207aaa11475277aecb7 changed from using gensub() to gsub() in awk. However, it didn't halve the number of backslashes in the target strings. This is necessary because backslash is used in gensub() target strings to allow substitution of text matching parenthesised subexpressions. This is not the case with gsub(). So, halve the number of backslashes in the target string where gsub() is used in statd-callout. This is the only target string broken by changes made by the above commit Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-26 07:34:51 +03:00			`ip = $1; gsub(/\./, "\\.", ip); \`
ctdb-scripts: Don't use the GNU awk gensub() function This is a gawk extension and can't be used reliably if just running "awk". It is simple enough to switch to using the standard sub() and gsub() functions. The alternative is to switch to explicitly running "gawk". However, although the eventscripts aren't exactly portable, it is probably better to move closer to portability than further away. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Michael Adam <obnox@samba.org> 2014-12-19 06:19:32 +03:00			`printf "s/^key.*=.*statd-state@\\(%s\\)@\\([^\"]*\\).*/\\1 \\2/p\n", ip }')`
ctdb: use properly configured ctdb in statd-callout Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-06-08 13:28:56 +03:00			`statd_state=$($CTDB catdb ctdb.tdb | sed -n "$sed_expr" | sort)`
ctdb-scripts: Add an early exit to statd-callout's notify case If $statd_state is empty then the loop will run once and print spurious errors. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-19 08:40:08 +04:00			`[ -n "$statd_state" ] || exit 0`
ctdb: Install helpers under libexecdir Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Andreas Schneider <asn@samba.org> 2016-02-12 02:12:13 +03:00			`smnotify="${CTDB_HELPER_BINDIR}/smnotify"`
ctdb-scripts: Change statd-callout to be more scalable Updating ctdb.tdb on each add-client, del-client and each delete during notify was too ambitious. Persistent transactions do not perform well enough to do this. Revert to having add-client and del-client create touch files. Each monitor event calls "statd-callout update" to convert touch files into ctdb.tdb records. Update testcases to do the "update" and add an extra test. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-13 12:55:43 +03:00			`prev=""`
			`echo "$statd_state" | {`
			`# This all needs to be in the same command group at the`
			`# end of the pipe so it doesn't get lost when the loop`
			`# completes.`
			`items=""`
			`while read sip cip ; do`
			`# Collect item to delete from the DB`
			`key="statd-state@${sip}@${cip}"`
			`item="\"${key}\" \"\""`
			`items="${items}${items:+${nl}}${item}"`
ctdb-scripts: Change statd-callout to be more scalable Updating ctdb.tdb on each add-client, del-client and each delete during notify was too ambitious. Persistent transactions do not perform well enough to do this. Revert to having add-client and del-client create touch files. Each monitor event calls "statd-callout update" to convert touch files into ctdb.tdb records. Update testcases to do the "update" and add an extra test. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-13 12:55:43 +03:00			`# NOTE: Consider optimising smnotify to read all the`
			`# data from stdin and then run it in the background.`

			`# Reset stateval for each serverip`
			`[ "$sip" = "$prev" ] || stateval="$state_even"`
			`# Send notifies for server shutdown`
ctdb-scripts: Quote some variable expansions This avoids relevant shellcheck warnings. This is most of the shellcheck low hanging fruit in the non-test code. Many of the other warnings produced by shellcheck are either false positives, are non-trivial to fix or a fix may result in worse code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Wed Jul 6 08:15:49 CEST 2016 on sn-devel-144 2016-06-29 11:11:44 +03:00			`"$smnotify" --client="$cip" --ip="$sip" \`
			`--server="$sip" --stateval="$stateval"`
			`"$smnotify" --client="$cip" --ip="$sip" \`
			`--server="$NFS_HOSTNAME" --stateval="$stateval"`
ctdb-scripts: Change statd-callout to be more scalable Updating ctdb.tdb on each add-client, del-client and each delete during notify was too ambitious. Persistent transactions do not perform well enough to do this. Revert to having add-client and del-client create touch files. Each monitor event calls "statd-callout update" to convert touch files into ctdb.tdb records. Update testcases to do the "update" and add an extra test. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-13 12:55:43 +03:00			`# Send notifies for server startup`
ctdb-scripts: Avoid shellcheck warning SC2004 ($ in arithmetic) SC2004: $/${} is unnecessary on arithmetic variables. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-07-06 09:50:30 +03:00			`stateval=$((stateval + 1))`
ctdb-scripts: Quote some variable expansions This avoids relevant shellcheck warnings. This is most of the shellcheck low hanging fruit in the non-test code. Many of the other warnings produced by shellcheck are either false positives, are non-trivial to fix or a fix may result in worse code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Wed Jul 6 08:15:49 CEST 2016 on sn-devel-144 2016-06-29 11:11:44 +03:00			`"$smnotify" --client="$cip" --ip="$sip" \`
			`--server="$sip" --stateval="$stateval"`
			`"$smnotify" --client="$cip" --ip="$sip" \`
			`--server="$NFS_HOSTNAME" --stateval="$stateval"`
ctdb-scripts: Change statd-callout to be more scalable Updating ctdb.tdb on each add-client, del-client and each delete during notify was too ambitious. Persistent transactions do not perform well enough to do this. Revert to having add-client and del-client create touch files. Each monitor event calls "statd-callout update" to convert touch files into ctdb.tdb records. Update testcases to do the "update" and add an extra test. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-13 12:55:43 +03:00			`done`
ctdb: use properly configured ctdb in statd-callout Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-06-08 13:28:56 +03:00			`echo "$items" | $CTDB ptrans "ctdb.tdb"`
ctdb-scripts: Change statd-callout to be more scalable Updating ctdb.tdb on each add-client, del-client and each delete during notify was too ambitious. Persistent transactions do not perform well enough to do this. Revert to having add-client and del-client create touch files. Each monitor event calls "statd-callout update" to convert touch files into ctdb.tdb records. Update testcases to do the "update" and add an extra test. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-13 12:55:43 +03:00			`}`

			`# Remove any stale touch files (i.e. for IPs not currently`
			`# hosted on this node and created since the last "update").`
			`# There's nothing else we can do with them at this stage.`
			`echo "$ctdb_all_ips" |`
ctdb-scripts: Quote some variable expansions This avoids relevant shellcheck warnings. This is most of the shellcheck low hanging fruit in the non-test code. Many of the other warnings produced by shellcheck are either false positives, are non-trivial to fix or a fix may result in worse code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Wed Jul 6 08:15:49 CEST 2016 on sn-devel-144 2016-06-29 11:11:44 +03:00			`awk -v pnn="$pnn" 'pnn != $2 { print $1 }' |`
ctdb-scripts: Change statd-callout to be more scalable Updating ctdb.tdb on each add-client, del-client and each delete during notify was too ambitious. Persistent transactions do not perform well enough to do this. Revert to having add-client and del-client create touch files. Each monitor event calls "statd-callout update" to convert touch files into ctdb.tdb records. Update testcases to do the "update" and add an extra test. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-13 12:55:43 +03:00			`while read sip ; do`
			`rm -f "statd-state@${sip}@"*`
			`done`
added hooks to make nfs statd behave correctly on failover (This used to be ctdb commit a1ee84fc47892b6c18d417ccf714211fcb07952e) 2007-05-31 05:09:45 +04:00			`;;`
			`esac`
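
The two least obvious steps of the notify case above are the even/odd state value and the awk-generated sed expression that turns database records into "server-IP client-IP" pairs. The following is a minimal standalone sketch of both, not the script itself. The function names `even_state` and `make_sed_expr`, the sample IP list, and the sample `key(...) = "..."` records are assumptions made for illustration; the real `ctdb ip all` and `ctdb catdb` output may differ in detail.

```shell
#!/bin/sh
# Illustrative sketch only: helper names and sample data are hypothetical.

# Round an epoch timestamp down to the nearest even number.
# Even means "service shut down"; even + 1 (odd) means "service started".
even_state ()
{
	echo $(( $1 / 2 * 2 ))
}

# Emit one sed command per public IP hosted on node $1.
# stdin: lines of "<ip> <pnn>" (assumed shape of "ctdb ip all" minus header).
make_sed_expr ()
{
	awk -v pnn="$1" 'pnn == $2 {
		# Escape dots so the IP matches literally in the sed regex
		ip = $1; gsub(/\./, "\\.", ip)
		printf "s/^key.*=.*statd-state@\\(%s\\)@\\([^\"]*\\).*/\\1 \\2/p\n", ip }'
}

# Hypothetical sample: node 0 hosts 10.0.0.1, node 1 hosts 10.0.0.2.
sample_ips="10.0.0.1 0
10.0.0.2 1"

# Hypothetical catdb-style records.
sample_db='key(37) = "statd-state@10.0.0.1@192.168.1.50"
key(37) = "statd-state@10.0.0.2@192.168.1.51"'

# Keep only records for IPs hosted on node 0, as "server-IP client-IP".
sed_expr=$(echo "$sample_ips" | make_sed_expr 0)
pairs=$(echo "$sample_db" | sed -n "$sed_expr" | sort)
echo "$pairs"

# State values sent in the shutdown and startup notifications.
shutdown_state=$(even_state 1700000001)
startup_state=$((shutdown_state + 1))
echo "$shutdown_state $startup_state"
```

With the sample data, only the record for 10.0.0.1 survives the filter, because 10.0.0.2 belongs to node 1. The state pair shows why the script sends the even value first and then increments it: each client sees a shutdown followed by a startup with a strictly larger state number.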
