IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Failures in startup/shutdown/releaseip/takeip are currently
incorrectly ignored.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=12837
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Some configurations may set CTDB_NFS_CALLOUT to the empty string.
They may do this if they allow a choice of NFS implementations. In
this case the default call-out for Linux kernel NFS should be used.
However, statd-callout does not call nfs_callout_init() to set the
default. Therefore, statd-callout is unable to restart the lock
manager, so the grace period is never entered.
statd-callout must call nfs_callout_init() before trying to restart
the lock manager.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=12589
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Feb 16 09:21:03 CET 2017 on sn-devel-144
Commit 0ca00267cd removed explicit
continuations in strings for awk programs. In one case this causes a
disconnect between condition and action, where an implicit
continuation does not work. This results in duplicate lines in the
rt_tables file.
Move the opening brace for the action to make the implicit
continuation work as expected.
An alternative would be to revert the removal of the explicit
continuations and add shellcheck tags. However, that doesn't mean
that an author of future code will necessarily use explicit
continuations, so the same mistake might still be make in the future.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=12516
Reported-by: Barry Evans <bevans@pixitmedia.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This changed to "ctdb gratarp" some time ago but the scripts were
never updated.
Fix the documentation for the ctdb tool too.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=12512
Reported-by: Ralph Böhme <slow@samba.org>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The debug() function, which is the only user of this variable, is no
longer used. It is also dropped.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is only used in 1 place, so just inline the check.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
They contain too much unnecessary complexity, some of which was used
to support CTDB_SERVICE_AUTOSTARTSTOP.
Also removed unused functions for service management.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The flag this sets is no longer used by ctdb_check_tcp_ports()
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Commit 86792724a2 added complications on
top of the multiple TCP port checking methods that used to exist.
Life is simpler now and the cause of any failures is obvious. So just
print a simple message if the port check fails.
Tweak tests to match changes. Drop one test that becomes a duplicate.
Temporarily tweak ctdb_check_command() so that it passes shellcheck
tests. It will be removed anyway in a subsequent commit.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This has bit-rotted, at least for NFS. It can be fixed but it is
better to remove it because it adds a lot of unnecessary complexity.
Variable event_name becomes unused so remove it. Also remove
associated tests.
To continue to manage/unmanage services while CTDB is running:
* Start service by hand and then flag it as managed
* Mark service as unmanaged and shut it down by hand
In some cases CTDB does something fancy - e.g. start Samba under
"nice", so care is needed. One technique is to disable the
eventscript, mark as managed, run the startup event by hand and then
re-enable the eventscript.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This gets rid of implicit check if a service needs to configured. As a
side effect, we also get rid of the monitor "replay" which was
introduced to avoid a collision between a script executed via event and
manually. Event scripts are not expected to be run by hand.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
This will help get rid of implicit ctdb_service_check_reconfigure.
We still need to keep "reconfigure" event in 13.per_ip_routing, so that
the per ip routing can be refreshed if the configuration has changed.
The correct fix for this is to add caching of configuration and checking
of configuration changes in "ipreallocated" event.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
This is a regression introduced in f227c26178.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=12407
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Nov 3 10:10:31 CET 2016 on sn-devel-144
Don't just rely on the process existing. It must be called "ctdbd".
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Oct 14 11:54:40 CEST 2016 on sn-devel-144
The PID file has been used since CTDB 2.3. Assume that anyone
upgrading from an older version does a clean shutdown, upgrades CTDB
and then does a clean start (as opposed to upgrade CTDB and then
restart).
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Volker Lendecke <vl@samba.org>
kill_ctdbd() kills the daemon and then removes the PID file. This is
racy because a new daemon could write a new PID file in between the
kill and the removal. Reversing these steps would be an improvement.
However, none of the places where kill_ctdbd() is called is a safe
place to remove the PID file. There is always a chance that a new
daemon could start, write a new PID file and then kill_ctdbd() could
remove the new PID file.
ctdbd is able to overwrite a stale PID file by checking to see if it
is locked.
Therefore, entirely drop removal of the PID file from ctdbd_wrapper.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=12287
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Volker Lendecke <vl@samba.org>
If any processes remain then they may be stuck in D state and this
might tell us why.
Update tests: tweak pidof stub, add support for smbd stack traces and
add some new tests for the shutdown event.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Oct 10 12:54:24 CEST 2016 on sn-devel-144
https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/ says:
network.target has very little meaning during start-up. [...]
Whether any network interfaces are already configured when it is
reached is undefined. [...]
network-online.target is a target that actively waits until the
ne[t]work is "up",
CTDB expects to be able to bind a socket to a node address and expects
interfaces for public IP addresses to exist. CTDB also doesn't expect
time to jump, so also wait until time is synchronised.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=12255
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Steve French <sfrench@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Tested-by: Steve French <sfrench@samba.org>
Pipe the connections to "ctdb tickle" on stdin to avoid having to fork
so many "ctdb tickle" processes. This maintains the current level of
verbosity at the price of some extra memory usage.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu Sep 8 10:45:42 CEST 2016 on sn-devel-144
SC2012: Use find instead of ls to better handle non-alphanumeric filenames.
Make this cope better with unexpected whitespace.
Unfortunately, this results in shellcheck warning:
SC2035: Use ./*.tdb.* so names with dashes won't become options.
No! Then stat(1) will print ./file.tdb.X. We want the basenames and
we know the filenames don't start with dashes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC2012: Use find instead of ls to better handle non-alphanumeric filenames.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC2038: Use -print0/-0 or -exec + to allow for non-alphanumeric filenames.
The suggested options aren't POSIX-compliant. This is more portable.
Base filenames can't have whitespace so rework to avoid problems with
whitespace in directory name.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC2039: In POSIX sh, ulimit -c/-n is not supported.
Have shellcheck suppress the warnings. If -n is not supported then
don't set CTDB_MAX_OPEN_FILES. If packaging for a platform where -c
is not supported then remove this code and associated documentation.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC2039: In POSIX sh, -nt is not supported.
This script is specific to the Linux NFS implementation. The -nt
operator is well supported in Linux shells (e.g. dash, bash, ksh).
The alternatives (e.g. using stat(1)) would result in less readable
code.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC2094: Make sure not to read and write the same file in the same pipeline.
The semantics here are unclear, so use a separate flock file in each
case for clarity.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC2039: In POSIX sh, echo flags are not supported.
echo -n is well supported but the changes are simple.
Improve some logic, replace some instances with printf. Who knew
printf was in POSIX?
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC2039: In POSIX sh, 'type' is not supported.
type is commonly supported and is more portable than which(1).
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC2015: Note that A && B || C is not if-then-else. C may run when A is
true.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC2119: Use FUNC "$@" if function's $1 should meanscript's $1.
SC2120: FUNC references arguments, but none are ever passed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC2017: Increase precision by replacing a/b*c with a*c/b.
This code intentionally rounds to an even value.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC1004: You don't break lines with \ in single quotes, it results in
literal backslash-linefeed.
These don't hurt, since awk can cope with the continuations. However,
they don't add anything.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC2154: VAR is referenced but not assigned.
Change ctdb_setup_service_state_dir(), ctdb_get_pnn() and
ctdb_get_ip_address() to print the value so it can be assigned to a
variable. The performance gain from avoiding the sub-shells when
calling these functions is close to zero.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC2046: Quote this to prevent word splitting.
SC2086: Double quote to prevent globbing and word splitting.
Add some quoting where it makes sense. Use shellcheck directives for
false-positives.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC2034: VAR appears unused. Verify it or export it.
Drop some variables that are unnecessarily used. Use shellcheck
directive for false-positives.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC2030: Modification of VAR is local (to subshell caused by (..) group).
SC2031: VAR was modified in a subshell. That change might be lost.
Fix a related, incorrect comment.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
SC2016: Expressions don't expand in single quotes, use double quotes for that.
Error messages are now arguably more readable.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
It is no longer used and adds needless complexity.
As a side-effect, the functions file can now be parsed by shellcheck.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This makes the logic more obvious.
Fix the (probably) accidental fall-through to the regular monitor
failure.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* Re-indent case labels as per new script style
Other indentation can be tweaked later as code changes, but the
labels are an obvious bulk change.
* Minor whitespace fixes
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
It doesn't do anything. Add a comment to its definition to explain
why it is still there.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
If there are insufficient arguments then they can't be shifted.
This function will be removed shortly. However, it needs to work for
now as tests will be added that depend on it to work.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This avoids relevant shellcheck warnings. This is most of the
shellcheck low hanging fruit in the non-test code. Many of the other
warnings produced by shellcheck are either false positives, are
non-trivial to fix or a fix may result in worse code.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Jul 6 08:15:49 CEST 2016 on sn-devel-144
* Assign the output of dirname to temporary variable to avoid word
splitting when directory name contains whitespace
* Drop export of CTDB_BASE to avoid masking broken return value -
functions file does the export anyway
* Quote path when including functions file
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This avoids having to export it in every file that includes the
functions file.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Added so that nfs_check_services() could be run against an arbirary
directory. However, with the function moved to the event script, this
isn't useful. CTDB_NFS_CHECKS_DIR can be used for testing instead.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Sat Jun 11 10:23:03 CEST 2016 on sn-devel-144
This makes ctdbd_wrapper usable in non-standard installs.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
This generates takeip-pre and releaseip-pre call-out events.
One use is to put NFS into grace before an IP is assigned to an
interface.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
A second NFS eventscript may be required, so make this code available
to it.
The initialisation code can't be evaluated in the functions file
because service_state_dir isn't yet setup, so put it in a function and
call it with other initialisation code.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This doesn't need to print the family. Nothing uses it.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Block and unblock IP addresses using these new functions. This makes
the code more readable.
The case statement in each function is very cheap, so there is no need
to prematurely optimise and pass the family.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
CTDB_INIT_STYLE isn't used in this script.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Kai Blin <kai@samba.org>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri May 20 21:06:18 CEST 2016 on sn-devel-144
Some Linux distributions don't have a "service" compatibility command.
To avoid breaking working systems, prefer the "service" compatibility
command just in case it does some extra, unexpected magic.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Kai Blin <kai@samba.org>
Add CTDB_NFS_STATE_FS_TYPE and CTDB_NFS_STATE_MNT config options, show use in
nfs-ganesha-callout. Since the callout script is only an example, we
officially don't have default values for these.
Signed-off-by: Jose A. Rivera <jarrpa@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
The underlying change is to allow the cluster mutex argstring to
optionally contain a helper command. When the argument string starts
with '!' then the first word is the helper command to run. This is
now the standard way of changing the helper from the default.
CTDB_CLUSTER_MUTEX_HELPER show now only be used to change the location
of the default helper when testing.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
When external monitoring is enabled for an NFS service using
service_check_cmd then $ctdb_check_rpc_out is empty because the
internal RPC checking isn't used. This results in empty log messages
like:
60.nfs: ERROR:
or:
60.nfs: WARNING:
Improve this so it at least says:
60.nfs: ERROR: monitoring service "statd" failed
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Don't do a percentage calculation for either memtotal or swaptotal if they
are zero.
Signed-off-by: Jose A. Rivera <jarrpa@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
ss with a filter is much faster than post-processing output from
netstat. CTDB already has a hard dependency on iproute2 for IP
address handling, so depending on ss is no big deal.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This currently causes monitor failure.
Log a warning instead. If there is a transient issue, such as NFS
being restarted in the background, then the thread count file should
be there the next time around so the count can be adjusted if
necessary.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
For "master", if there is a master then print the PNN, otherwise print
nothing.
For "list", print the PNN and IP addresses without a colon in between.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
To keep this commit comprehensible, 91.lvs and the CTDB CLI tool are
temporarily inconsistent. The tool will be made consistent in a
subsequent commit.
LVS now uses a configuration file specified by CTDB_LVS_NODES and
supports the same slave-only syntax as CTDB_NATGW_NODES. LVS also
uses new variable CTDB_LVS_PUBLIC_IFACE instead of
CTDB_PUBLIC_INTERFACE.
Update unit tests and documentation.
Note that the --lvs and --single-public-ip daemon options are no
longer used. These will be removed and relevant documentation
updated in a subsequent commit.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Having both "recovered" and "ipreallocated" means that everything
happens twice when there is a recovery. No need for that.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Basic error redirection improvements before writing tests.
Deleting the service during "startup" will usually fail because the
service has never been setup, so redirect output to avoid logging an
error.
Similarly, deleting the service in "ipreallocated" will always fail
the first time, which would cause an error to be logged. Given the
simplicity of the script, there's no sane way to avoid the error
sometimes and log it if it actually matters. This could potentially
be tidied up in the future by making 91.lvs stateful, in a similar way
to 11.natgw.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
ctdb_killtcp will take up to 5 seconds to kill connections, so don't
wait in a loop. Just check if there are remaining connections on
completion and log a message either way.
Also add a test stub.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This will be needed for a rewrite of the connection killing code but
it is not used yet.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is unnecessary in Samba >= 4.0 because winbindd monitors IP
address itself and no longer needs to be told when they are dropped.
The smbcontrol commands can hang if a node has recovery mode active
because smbcontrol is unable to connect to the registry. Therefore,
the smbcontrol commands should be removed.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=11719
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Ralph Boehme <slow@samba.org>
Autobuild-User(master): Ralph Böhme <slow@samba.org>
Autobuild-Date(master): Wed Feb 10 14:08:17 CET 2016 on sn-devel-144
Nothing checks it anymore.
This means that the NAT gateway capability in the daemon is now
unused.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
To keep this commit comprehensible, 11.natgw and the CTDB CLI tool are
temporarily inconsistent. The tool will be made consistent in a
subsequent commit.
ctdb_natgw_slave_only() is reimplemented to check for the option in
the appropriate line in $CTDB_NATGW_NODES.
Update unit tests and documentation.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This has always been the case. Now it is documented and enforced.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Now its name describes its usage and the code reads better.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Each is now used in only one place and the logic is more obvious
without them.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reduces intentation by using early returns.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Jan 14 22:41:29 CET 2016 on sn-devel-144
Consider this sequence of events:
1. Instance of script running update_tickles() hangs
2. Script debugging is launched asynchronously
3. New instance of script is launched, creates temporary file(s)
4. Original hung script makes progress before asynchronous script
debugging kills it, so it removes temporary file(s)
5. New instance of script produces error due to missing files(s)
This is obviously rare.
Use more unique filenames to avoid step (4) removing the file(s)
belonging to other instances of the script.
This requires some extra cleanup to avoid too many temporary files
(which is why unique filenames were not originally usd). It is
sufficient to remove files modified at least 10 minutes ago.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
Whitespace and indentation improvements.
Remove comments describing events, since the README covers that much
better.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
The current code uses so many shell idioms that it is difficult to
follow.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
Move information about TCP connection tracking and resetting into
ctdb.7.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
exportfs can hang when, for example, DNS is flakey. Given that
exports don't change much, it makes sense to cache them.
Don't try to add error handling when exportfs fails but do print a
warning. Proper error handling can be added separately.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
These should be created elsewhere. If not then something is wrong, so
don't hide the problem.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Nov 20 04:40:26 CET 2015 on sn-devel-104
Various scripts (including debug_locks.sh, 00.ctdb, 05.system) need
CTDB_DBDIR to point to the right place... but it doesn't.
Move the rewriting of CTDB_DBDIR to loadconfig() so that it happens
for all scripts. Have this code set internal variable
CTDB_DBDIR_TMPFS_OPTIONS so that ctdbd_wrapper can do the mount.
This loses the generality that was present in dbdir_tmpfs_start() but
it wasn't being used anyway. If it is needed in the future then it
will be in the git history.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Nov 18 11:51:54 CET 2015 on sn-devel-104
The tmpfs is mounted and unmounted by ctdbd_wrapper. Format is
CTDB_DBDIR=tmpfs:<tmpfs-options>. The only default for the tmpfs is
mode=700 - to override, specify a different value in <tmpfs-options>.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
Autobuild-User(master): Michael Adam <obnox@samba.org>
Autobuild-Date(master): Mon Nov 9 10:58:32 CET 2015 on sn-devel-104
This will make it easier to run things after CTDB is stopped.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
Autobuild-User(master): Andrew Bartlett <abartlet@samba.org>
Autobuild-Date(master): Fri Nov 6 13:43:45 CET 2015 on sn-devel-104
Since commit 84f5528d9b, CTDB will not
remove an existing socket if it can connect to the existing one.
Instead it will fail to start.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Oct 28 09:44:37 CET 2015 on sn-devel-104
If 2 attempts are made to start CTDB in quick succession then it is
possible for the 2nd attempt to remove a newly created PID file from
the 1st.
If the PID file existed then the PID/SID from ctdbd_is_running() will
be passed to kill_ctdbd(). If the PID file did not exist then there
is no point removing it.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
In this case: ctdbd_wrapper, onnode, ctdb_diagnostics, ctdb.sudoers.
Set sensible defaults from configure options.
Update documentation to match, trying to fix up anything that has been
missed before.
The onnode unit tests need a symlink to the functions file.
The simple integration tests need to set CTDB_BASE and also
need symlinks to functions/nodes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
fixup
Signed-off-by: Martin Schwenke <martin@meltin.net>
This variable points to /etc/ by default.
This distinguishes it from the different variable from wscript, which
points to /etc/ctdb/.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
Also fix an unused test to set CTDB_BASE.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
Clearly identifies where all state files from scripts should go and
allows that location to be easily changed. This patch should not
change any behaviour (outside of eventscript unit tests, where a
clearer location is now used).
CTDB_VARDIR should no longer be overridden. Continue to set
CTDB_DBDIR and similar to override database location in unit tests.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
Have wscript do path substitution.
No need to export this and CTDB_ETCDIR here, but test scripts will
still need to do so.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
Using $CTDB_VARDIR at file scope is dangerous because it doesn't
respect the configuration. Uses of these variables are simple so just
drop the variables and use $CTDB_VARDIR inside functions where it is
safe.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
This directory has been under $CTDB_VARDIR for a long time, so is
already removed. The command is also potentially dangerous if
$ctdb_managed_dir is not set.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
They're not used outside of ctdb_standard_event_helper().
As a consequence, make ctdb_standard_event_helper() do nothing. It is
harder to remove because it is used in many places, perhaps by
external eventscripts where it has been copied from existing scripts.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
To get a similar effect just do something like this:
mmaddcallback ctdb-disable-on-quorumLoss \
--command /usr/bin/ctdb \
--event quorumLoss --parms "disable"
mmaddcallback ctdb-enable-on-quorumReached \
--command /usr/bin/ctdb \
--event quorumReached --parms "enable"
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
60.nfs sets this to a directory where the NFS callout can store state.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Sep 14 08:36:34 CEST 2015 on sn-devel-104
Always check filesystem usage for the database directories.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Sat Aug 29 20:08:48 CEST 2015 on sn-devel-104
CTDB should warn by default if too much system memory or swap is used.
The tests have also been tweaked. In particular, the filesystem-only
tests need to initialise the memory information to avoid errors where
meminfo isn't set.
Document the defaults, warning against disabling them.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
They are only printed when the percentage usage changes. This should
stop the logs from being filled with warnings.
Add a test for the throttling.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Marking the node unhealthy should cause Samba processes to close,
possible freeing a stack of memory. If not, then it is somebody
else's problem.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
New variables CTDB_MONITOR_MEMORY_USAGE and CTDB_MONITOR_SWAP_USAGE.
Both take a pair of <warn_threshold>:<unhealthy_threshold> where each
theshold is specified as a percentage.
This adds a callout to check_thresholds() that is run when the
unhealthy threshold is reached.
Add some combination tests.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
CTDB_MONITOR_FREE_MEMORY and CTDB_MONITOR_FREE_MEMORY_WARN are now
percentages that specify thresholds of acceptable memory usage.
Memory/swap usage in tests also specified as percentages.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
No need to use 2 different sources of information for similar checks.
Also, output of free has been changed, whereas /proc/meminfo is a
kernel API, which will not change.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This allows both errors (i.e. unhealthy) and warnings for different
thresholds. It replaces CTDB_CHECK_FS_USE.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Drop obvious comments. Use die() for less lines of code. Use a case
statement to avoid forking unnecessary processes for each filesystem
being checked. Drop parentheses around percentages in messages.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Will put all the system monitoring in here, simplifying 00.ctdb.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
We don't expect to maintain an up-to-date copy. NFS Ganesha team
might provide patches.
Also move the Ganesha .check file
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This adds new configuration variable CTDB_RPCINFO_LOCALHOST6.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is an optimisation to avoid forking the callout for operations
that are not implemented.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Unhealthy after 1 failed attempt to contact the portmapper.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Now that there is only a single NFS eventscript, other eventscripts no
longer need to load all of this.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is now handled by passing the desired number of threads to the
command specified in the dump_stuck_threads variable in .check files.
Remove unused function nfs_dump_some_threads().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This isn't a straightforward move of code from 60.ganesha to the
callout. Simplifications have been made to allow better
interoperation with the new NFS checking logic.
The following configuration variables have been removed:
CTDB_GANESHA_REC_SUBDIR
Edit NFS ganesha callout to change this location
CTDB_NFS_SERVER_MODE, NFS_SERVER_MODE
Use CTDB_NFS_CALLOUT instead
CTDB_NFS_SKIP_KNFSD_ALIVE_CHECK, CTDB_SKIP_GANESHA_NFSD_CHECK
Disable the corresponding .check file instead
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
$service_check_cmd specifies a command to run instead of the regular
rpcinfo-based check.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Note that the 60.ganesha RPC checks need to be identical to those in
the nfs-checks.d/ directory. This is because the NFS unit test
infrastructure checks output against what should be produced by the
checks in nfs-checks.d/. This is a minor issue, since one of the aims
of this work is to remove the need for a separate 60.ganesha.
In most cases configuration variable CTDB_NFS_DUMP_STUCK_THREADS is
now ignored. This is now handled by passing the desired number of
threads to the command specified in the service_debug_cmd variable in
a .check file.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Provides a new extensible format for .check files, using simple
variables instead of the unwieldy extended test(1) syntax now used.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Change status, nlockmgr, mountd, rquotad to be unhealthy after 6
rpcinfo check failures and do a verbose restart after every 2
failures. Change 60.ganesha for consistency, since 60.ganesha tests
are broken and depend on the consistency.
Apart from the consistency aspect, the check infrastructure will soon
be simplified so that it only allows the equivalent of "unhealthy" and
"verbose restart:b" actions.
Update tests to have a corresponding numbers of iterations. Run 1
extra iteration in most tests to check there are no unexpected
behaviour changes after the designated number of iterations completes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
VLAN interfaces on bonds with a name other than <iface>.<id>@<iface>
are not currently supported. That is, where the VLAN name isn't based
on the underlying bond name. Such VLAN interfaces can be created with
the "ip link" command, as opposed to the "vconfig" command, or by
renaming a VLAN interface.
This is improved by determining the underlying interface name for a
VLAN from the output of "ip link".
No serious attempt is made to support VLANs with '@' in their name,
although this seems to be legal. Why would you do that?
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Commit 6471541d6d broke support for VLAN
interfaces. Releasing a public IP address depends on
ip_maskbits_iface() and for a VLAN interface this will return an
interface of the form <vlan>@<iface>, which can't be fed back into
"ip" commands.
Update ip_maskbits_iface() to drop the '@' and everything after it.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reported-by: Jan Schwaratzki <jschwaratzki@ddn.com>
If statd-callout is unwanted, so is removed, then this code fails.
Change to an "if" so that it succeeds as intended.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
On IPv4-only or IPv6-only systems one of these files will not exist.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This will handle the most obvious cases. It won't handle the case
where the directory is missing and the recovery lock location is
updated at run-time. However, this is a good improvement.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
While 'which' is a very common tool, on many distros it is not a requirement
that it be installed. 'type' is a shell built-in specified by the Open Group,
and is found in shells like bash, dash, and ksh across multiple OSes.
Signed-off-by: Jose A. Rivera <jarrpa@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
Autobuild-User(master): Michael Adam <obnox@samba.org>
Autobuild-Date(master): Fri Jun 5 20:39:47 CEST 2015 on sn-devel-104
This is an alternative to 10.interface and is installed as disabled by
default. It should only be used with DisableIPFailover=yes and when
IP failover is being handled externally. In this mode CTDB can be
informed of public IP address movements using "ctdb moveip".
During the "startup" event, this eventscript currently finds any
public IP addresses configured in $CTDB_PUBLIC_ADDRESSES and tells
CTDB which node they are on using "ctdb moveip". This allows CTDB to
send ARPs and tickle-ACKs.
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
00.ctdb should not know about public IP addresses.
Move related tests to operate on 10.interface.
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This isn't used or documented anywhere.
2 differing points of view:
* This is a very good idea but it should probably be generalised to
cover more configuration items. This would end up like the Samba
registry configuration and would use a tool to support setting
configuration values.
* If people really want to update configuration while a node is down
then they should fix the configuration before bringing up that node.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
For example, adding a file called nfs-rpc-checks.d/20.nfsd@udp.check
will cause NFS to be checked on UDP as well, using a separate counter.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu Apr 30 09:24:12 CEST 2015 on sn-devel-104
If tdb database file size grows beyond 4GB, tdbtool/tdbdump can hang
indefinitely. This will prevent CTDB from starting up.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Required when automatic address detection can not be used. This can
be the case when running multiple ctdbd daemons/nodes on the same
physical host (usually for testing), using InfiniBand for the private
network or on Linux when sysctl net.ipv4.ip_nonlocal_bind=1.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Apr 27 06:10:08 CEST 2015 on sn-devel-104
"ctdb xpnn" does not work when sysctl net.ipv4.ip_nonlocal_bind=1,
since it determines the node by attempting to bind to each addres in
the nodes file. The solution is to not use "ctdb xpnn". After the
initial call, ctdb_get_pnn() will be more efficient that "ctdb xpnn".
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This avoids the expense of establishing a client connection to the
daemon just to get the PNN of the current node.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
It's possible for a RPC service to register only for UDP and not TCP.
Since we assume all the NFS operations are over TCP, always check RPC
services over TCP.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
$RPCMOUNTDOPTS is ignored when restarting rpc.statd due to the service
being unresponsive. This variable can be used to increase the number
of rpc.mountd threads when there are a lot of clients reattaching so
ignoring it can mean that only a single rpc.mount thread is started.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
When executing a shell script code "foo | bar", if "bar" terminates early,
then "foo" can get I/O error when writing to stdout.
The tdbtool stub did not wait to read anything from stdin when it is
expected to. This would cause tests to fail randomly under load when
tdbtool process exited early.
Similarly, debug function read from stdin only under certain conditions
(higher debug and when not reading from tty). Otherwise, exited early.
Thanks to Andrew Bartlett for noticing the problem and Catalyst Cloud
(http://catalyst.net.nz/cloud) for providing resources to test fixes.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-Programmed-With: Martin Schwenke <martin@meltin.net>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
Autobuild-User(master): Michael Adam <obnox@samba.org>
Autobuild-Date(master): Fri Mar 20 16:26:37 CET 2015 on sn-devel-104
IP addresses and routes are only changed if either the NAT gateway
configuration or the NAT gateway master node has changed. If running
"ip monitor" this will minimise the amount of noise seen. It should
also be more lightweight at the expense of managing a couple of state
files.
Add a test to check that configuration changes behave correctly.
Tweak the static route result generation code so that the required
output is sorted.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Updating ctdb.tdb on each add-client, del-client and each delete
during notify was too ambitious. Persistent transactions do not
perform well enough to do this.
Revert to having add-client and del-client create touch files. Each
monitor event calls "statd-callout update" to convert touch files into
ctdb.tdb records.
Update testcases to do the "update" and add an extra test.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Commit 4638010abb changed from using
gensub() to gsub() in awk. However, it didn't halve the number of
backslashes in the target strings. This is necessary because
backslash is used in gensub() target strings to allow substitution of
text matching parenthesised subexpressions. This is not the case with
gsub().
So, halve the number of backslashes in the target string where gsub()
is used in statd-callout. This is the only target string broken by
changes made by the above commit
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
All tunables set in configuration are currently set to 0 on system
where /bin/sh is dash (and perhaps other non-bash shells). dash puts
single quotes around all values in the output of the "set" builtin
command, whereas bash only puts them around values when something
needs to be quoted. Tunables always have a simple integer value so
dash will quote them and bash won't. The setup code currently passes
the raw value, including any quotes to "ctdb setvar ...". This
command does no error checking on the input, so "'1'" is converted to
0.
Change the code so that the value is determined from the shell
variable and is independent of the "set" output.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This was true for the daemon until commit
b4589b954e.
Defaulting to ERR in the ctdb CLI tool encourages logging notices at
ERR level, so default to NOTICE instead.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Drops the iptables() and ip6tables() functions and, hence, the
hardcoding of paths /sbin/iptables and /sbin/ip6tables. The latter
avoids problems on openSUSE where (for example) /usr/sbin/iptables is
used instead.
This means that locking around ip*tables commands is only done when
iptables_wrapper is called directly. This is fine because the only
conflict is when "releaseip" or "takeip"/"updateip" events are run in
parallel. The other uses in 11.natgw and 70.iscsi are in events where
there will be no collisions.
Making 11.natgw support IPv6 is unnecessary. Just put a static IPv6
address on each interface - they're plentiful.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Jan 28 08:29:55 CET 2015 on sn-devel-104
Block iSCSI port for families of all address the node is configured to
host.
Could just unconditional add blocking using ip6tables instead.
However, this would produce errors when no IPv6 public addresses are
configured and ip6tables is not installed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is a gawk extension and can't be used reliably if just running
"awk". It is simple enough to switch to using the standard sub() and
gsub() functions.
The alternative is to switch to explicitly running "gawk". However,
although the eventscripts aren't exactly portable, it is probably
better to move closer to portability than further away.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
Falling back to running the initscript doesn't work because it detects
that upstart is being used and fails. This was observed when trying
to start winbind on Ubuntu 11.04.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
BUG: https://bugzilla.samba.org/show_bug.cgi?id=11007
Signed-off-by: Oleksandr Chumachenko <ledest@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
There are a few potential failure modes when adding an IPv6 address.
It takes a little while of duplicate address detection to complete, so
wait for a while. After a timeout, also need to check to see if
duplicate address detection failed - if it did then actually drop the
IP address.
This really needs some careful thinking. If CTDB disappears on a node
but the node's IP addresses are still on interfaces then the above
failure mode could cause the takeover nodes to become banned.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Add checking to "releaseip" and "updateip" to ensure that the given IP
address is really on the given interface with the given netmask. If
reality doesn't match the given arguments then believe reality.
Use new function iptables_wrapper() instead of calling iptables()
directly.
Use new function flush_route_cache() instead of doing IPv4-specific
/proc magic.
Remove setting of otherwise unused variable "failed".
Fix a test for which the error message has changed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
ip6tables() uses the same lock as iptables(). This is done on
suspicion.
iptables_wrapper() takes 1st argument "inet" or "inet6", and the rest
is passed to the correct iptables variant.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
It also prints a third word, the address family. This is either
"inet" or "inet6".
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Also update associated eventscript unit tests and ctdb stub.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
There are parentheses missing that stop the default pattern from
matching commands with trailing garbage (e.g. "exportfs.orig").
A careful check of POSIX (and running GNU sed with --posix) suggests
that "\|" isn't a supported way of specifying alternation in a regular
expression. Therefore, it is clearer to switch to extended regular
expressions so that this has a chance of being portable (even though
the point is to print /proc/<pid>/stack, which only works on Linux).
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Nov 18 06:37:45 CET 2014 on sn-devel-104
Also add and update tests for statd stack dumps. Update the existing
60.ganesha statd test to do more iterations. Duplicate the result as
a new test for 60.nfs.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
In the process, fix a bug where an extra trace would be printed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Some implementations may not understand RC3164 format messages on the
UDP socket, so add support for RFC5424 message format.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This has most of the advantages of the old logd with none of the
complexity of the extra process. There are several good syslog
implementations that can listen on the UDP port.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Remove --logfile and --syslog daemon options and replace with
--logging.
Modularise and clean up logging initialisation code. The
initialisation API includes an app_name argument that is currently
unused - this will be used in extensions to the syslog backend.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
As far as we know, nobody uses this and it just complicates the
logging subsystem.
Remove all ringbuffer code and documentation. Update the local
daemons startup code correspondingly.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Volker Lendecke <vl@samba.org>
The major and minor device numbers are hexadecimal not decimal.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Sep 25 07:19:59 CEST 2014 on sn-devel-104
Variables that are not set but exported, may return an empty string
for getenv(). Tested on freebsd.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Wed Sep 17 09:55:47 CEST 2014 on sn-devel-104
The current check is incorrect in 2 ways:
* Commit be71a84565 contained a thinko
that stops virtio_net interfaces from simply being marked up
* virtio_net interfaces can actually be down
virtio_net has supported ethtool since Linux 2.6.29, so just remove
the special case. This means that testing CTDB on very old virtual
machines is not supported.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu Jul 31 13:08:47 CEST 2014 on sn-devel-104
This was used to limit damage in the "recovered" event.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Jul 29 10:03:16 CEST 2014 on sn-devel-104
This event was introduced to handle misconfiguration. For example,
where all nodes where configured as NAT gateway slaves.
However, this event can fail when there are performance issues and
capabilities can't be retrieved from a remote node. The problem is
most likely with the remote node, so marking the local node UNHEALTHY
is probably a mistake.
Having a NAT gateway master node only matters in "ipreallocated", so
leave it to do the checking. Given that a node will run
"ipreallocated" as part of the first recovery, this should cause
misconfigurations to be detected nice and early.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Need to be able to recognise a RHEL system. Still use "system" to
start and stop service, since that still works and yields the smallest
change.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Debugging can still be running when a monitor event times out and
scriptstatus output changes.
When debugging a hung script to a log file, write to a temporary file
and move the temporary file over the log file when done. The test
then waits for the log file to appear.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu Jul 3 08:19:23 CEST 2014 on sn-devel-104
There shouldn't be an early exit for the "init" event. Just make the
"ctdb scriptstatus" call conditional.
While here, move the comment about only running a single instance to
be near locking code. The comment is more useful there.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Check that the $GANRECDIR symlink points to the location specified by
$CTDB_GANESHA_REC_SUBDIR and replace it if incorrect. This handles
reconfiguration and filesystem changes.
While touching this code:
* Create the $GANRECDIR link as a separate step if it doesn't exist.
This means there is only 1 place where the link is created.
* Change some variables names to the style used for local function
variables.
* Remove some "ln failed" error messages. ln failures will be logged
anyway.
* Add -v to various mkdir/rm/ln commands so that these actions are
logged when they actually do something.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Jun 20 05:40:16 CEST 2014 on sn-devel-104
Backup and restore of the cluster filesystem can upset the operation
of 60.ganesha by changing the contents of this subdirectory.
Allow this subdirectory to be configured to a subdirectory that is
ignored by backup and restore processes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Jun 11 09:29:22 CEST 2014 on sn-devel-104
The range
CTDB_PER_IP_ROUTING_TABLE_ID_LOW..CTDB_PER_IP_ROUTING_TABLE_ID_HIGH
should not include 253-255. Otherwise policy routing may overwrite
the default system routing tables.
Add some corresponding tests.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is the loop variable. It can't be empty, especially given the
way the list is built. This must have survived from an earlier
version of the script.
Given that there are whitespace changes associated with the above,
clean-up the "virtio_net" avoidance check so that it reads less like
line-noise.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Commit 4ee4925d41 forgot about
CTDB_NATGW_SLAVE_ONLY so it introduces an incorrect failure when this
is set, and CTDB_NATGW_PUBLIC_IFACE or CTDB_NATGW_PUBLIC_IP is unset.
Relax the sanity check to see if CTDB_NATGW_SLAVE_ONLY is set.
Update the documentation to explicitly state that
CTDB_NATGW_PUBLIC_IFACE and CTDB_NATGW_PUBLIC_IP are optional and
unused if CTDB_NATGW_SLAVE_ONLY is set. It would be possible to
insist that CTDB_NATGW_PUBLIC_IFACE and CTDB_NATGW_PUBLIC_IFACE should
be unset in that case. However, it is more reasonable to allow
consistent configuration across nodes except with some nodes
configured slave-only.
Add tests, update infrastructure and fix a thinko in the stub's
"natgwlist" implementation.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Mon Apr 14 06:06:49 CEST 2014 on sn-devel-104
Extend CTDB_NATGW_STATIC_ROUTES so that each network can have an
optional gateway that overrides CTDB_NATGW_DEFAULT_GATEWAY.
Signed-off-by: Martin Schwenke <martin@meltin.net>
This has been implied since the command to add the route has had
errors redirected to /dev/null. If infrastucture (e.g. ADS, DNS) is
on the same network as CTDB_NATGW_PUBLIC_IP then no route is
necessary.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Although the dots in $CTDB_NATGW_PUBLIC_IP could probably only help
match an invalid public IP address, this is only executed once so do
as exact a check as possible.
Use CTDB_BASE instead of hardcoding /etc/ctdb.
Make the error message less redundant.
Signed-off-by: Martin Schwenke <martin@meltin.net>