IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
When stopping (as opposed to restarting) it is useful to see this
information.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit a9ab1937239761dc32b143c9d225447bc6f090b4)
d362be7d32079ac1390d67056ce107bfbca2c937 wasn't well thought out.
Subsequent commits depend on ctdb_counter_init() taking an argument,
so this makes those cases work.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 05a8fcfbac3da2b5843b31e0fe258255cc761190)
* Add an optional service name argument to existing reconfigure
functions.
* User function service_reconfigure() instead of variable
$service_reconfigure to specify how a service is reconfigured.
* New function ctdb_service_check_reconfigure() reconfigures a service
if it is flagged for reconfigure.
* Remove $service_reconfigure settings from 40.vsftpd and 41.httpd -
they're the defaults.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 15d4111d0761d82f57d5d4f0b1227812d14e4d7c)
Move flagging of managed or unmanaged services into
ctdb_service_start() and ctdb_service_stop(). That way services will
be correctly flagged if they are started from the startup and shutdown
events.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 8675744cbd90b5a5095ed6fff7b36ae82004a457)
service_start is currently a variable. This makes passing arguments
hard. We change it to be a function and put default definitions into
the functions file.
We use a convention that if a service name argument is passed to a
redefined version of service_start() or service_stop() then it will
act unconditionally. If no argument is passed then it can use
internal logic to decide if services should really be started. This
is useful when a single eventscript handles multiple services.
This is a cherry-pick of ae38895 that needed to be reset mid-stream.
There is still some breakage following this commit.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 86e4aefed9fd1028660c98e3ea758c2b75ffc1d8)
This function generates a lot of trace when running under "set -x".
This is due to the backward compatibility code.
This adds 3 optimisations:
1. Before invoking the backward compatiblity code,
is_ctdb_managed_service() returns early if the service is listed in
$CTDB_MANAGED_SERVICES.
2. ctdb_compat_managed_service() actually now updates
$CTDB_MANAGED_SERVICES instead of temporary variable $t.
This means that a subsequent call to is_ctdb_managed_service() will
short circuit due to optimisation (1).
3. ctdb_compat_managed_service() only adds a service to
$CTDB_MANAGED_SERVICES if it is the service being checked by
is_ctdb_managed_service().
This stops irrelevant services being added to
$CTDB_MANAGED_SERVICES multiple times by multiple calls to
is_ctdb_managed_service().
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 758f4667c60089e09a0439c1eb74f5e426ca5e2e)
To be used by eventscripts to create a per-service directory for their
own state data. $service_state_dir is set to point to the new
directory.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit a273554791c2a5281aee28f8e2be0c514e14c91e)
This was done ad hoc and was badly named.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 9a084a121f629b2c1bcefc1e4c4a4a5cacf53987)
Another unit testing hook. This is easier than dropping files into
rc.local.d/ and then removing them.
The file has to be executable.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit b13ac3bdaf326a6cdfd87da9195eb9630806c418)
For eventscript unit testing it will be necessary to override external
commands to allow stub implementations to be used. If absolute paths
aren't used then this can be done using either a fake bin/
subdirectory or by using shell functions.
This removes all of the simple cases of absolute paths.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Conflicts:
config/ctdb.init
config/events.d/50.samba
Keep old code but remove absolute paths.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 05851d50b0078de8bf4691442d718825adca6fe8)
These provide a thin layer around writing and reading files in /proc.
They can be easily replaced by stubs for unit testing.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 637f9d8af517b73c72ed8f3cc2a2661f11eb2126)
These haven't been used for a long time.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit f5fd361cadb3ea18d29e2d7215a7853718e48d00)
* $CTDB_ETCDIR defaults to /etc but can be changed for testing. All
hard-coded instances of /etc have been changed to $CTDB_ETCDIR.
This includes references to /etc/init.d and /etc/sysconfig.
* service() and nice_service() functions now call new function
_service(). This makes it easier to override these functions (say,
in rc.local) for testing and call most of the existing functionality
using _service().
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit f43c9a7604b779bb6257ddb2bf3cbe266d496a63)
This will be needed when eventscripts that use it are called
externally.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit ebd53b66b0cc66d9d04830781886234167fc2164)
Otherwise we might short-circuit events that are run only once and
actually need to do something.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit c4f9e8a43540bc049b2771e0a2d76d37b9d17331)
Otherwise there can be strange error messages from services
stopping/starting, without any context.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 8bcf7ab164429ddc0ae530133e114f186a8146dd)
"service nfs restart" can fail. To stop nfsd it sends a SIGINT and
nfsd might take a while to process it if the system is loaded.
Starting nfsd may then fail because resources are still in use.
This does some /proc magic to tell nfsd to do no more processing. It
then runs service stop, kills nfsd with SIGKILL, and then runs service
start. This is much less likely to fail.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit a9bf4f82852975b0b627f61ceb2d23401f630805)
ctdb_service_start() currently succeeds if ctdb_counter_init()
succeeds.
This changes it to fail when a service start fails.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit ddb73962d72d933bf0edc28be0dbb45bea7e5ef4)
When the value of $CTDB_MANAGES_SAMBA or $CTDB_MANAGES_WINBIND (or
corresponding changes are made to $CTDB_MANAGED_VERSIONS), the
associated service should be started or stopped as necessary.
This add calls to ctdb_start_stop_service() to manage
starting/stopping samba and winbind.
An associated cleanup is made to the initial checks that one of
$CTDB_MANAGES_SAMBA or $CTDB_MANAGES_WINBIND is set, replacing them
with calls to is_ctdb_managed_service().
To handle the winbind cases ctdb_start_stop_service() and
is_ctdb_managed_service() are updated to take an optional service name
parameter.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit d98f175e8420d921a123ae9c0ce00945350b1537)
update nfs to try to restart the service after 10 consecutive failures
and to flag the node unhealthy after 15
add similar function to mountd
(This used to be ctdb commit 1569a54bb82fc433895ed68f816cf48399ad9d40)
Rename loadconfig() to _loadconfig(). Add a new loadconfig() that
simply calls _loadconfig().
This makes it easy for the test suite to override loadconfig().
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 1d77a3adfff893b3c01b87f791e72c0d3148425c)
These failures are sometimes the result of slow restarts so we want to
avoid dirtying the logs or marking a node unhealthy because of them,
unless they are excessive.
For these 2 cases we use the existing fail counting code but hack a
temporary service_name in a subshell to allow separate fail counts.
We also update ctdb_check_rpc() so that it captures the error output
from rpcinfo and we add a message including the service name to the
beginning. The error is printed to stdout but is also stored in
ctdb_check_rpc_out to allow it to be conditionally used by the caller.
This function also now returns non-zero rather than exiting on
failure.
Other direct rpcinfo calls are relaced by called to ctdb_check_rpc()
for consistency.
Option handling code for service restarts is cleaned up so that fits
in 80 columns. A more informative restart messageis now used in all
cases, printing the exact command being used to start a service.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 79c25fe241cf5d8f92e23d3736823ebaf4e1769d)
since that will usually be /etc/ctdb/state and storing this under /etc is just
wrong.
Add a new variable CTDB_VARDIR that defaults to /var/ctdb and store the data there instead.
(This used to be ctdb commit 516423c25afa9861d9988096efa8a4a2b12b31b1)
This adds a new function update_tickles() that tracks tickles for a
given port using the new ctdb addtickle/deltickle commands. This
function is used in events.d/60.nfs to handle NFS tickles.
events.d/61.nfstickle is removed. The
/proc/sys/net/ipv4/tcp_tw_recycle setup is also moved to
events.d/60.nfs.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit dca4c4ebf3c35f8db3ae208efb7a83abbf726ed6)
sometimes (very rarely) fails to restart the service.
Add a function to restart NFSd on SLES and RHEL-like systems.
If we detect the system is unhealthy due to kNFSd not running,
try to restart the service again "service nfs restart" and
hope for the best.
CQ1019372
(This used to be ctdb commit 25c4ce7e919f13226219f036bcffd2be76b2f06c)
When doing a releaseip event, we do them in parallel for all the separate
IPs. This creates a problem for iptables, which isn't reentrant, giving
the strange message:
iptables encountered unknown error "18446744073709551615" while initializing table "filter"
The worst possible symptom of this is that releaseip won't remove the rule
which prevents us listening to clients during releaseip, and the node will be
healthy but non-responsive.
The simple workaround is to flock-wrap iptables. Better would be to rework
the code so we didn't need to use iptables in these paths.
CQ:S1018353
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 72d6914ee913272312d7b68f1be5ad05ad06587d)
This adds a generic infrastructure to register scripts which will
be called when the delete_ip_from_iface() funtion needs to readd
secondary ips to an interface.
metze
(This used to be ctdb commit ac97d65f44e8dc8bf2ec8f68e4db3448521755a2)
If "$1" was empty than loadconfig would load the ctdb config twice.
This stops that from happening.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 0406d406da70aaee7ad6aac236114905c5d03ed2)
When two releaseip events run in parallel it's possible that the 2nd script
readds a secondary ip that was removed by the 1st script.
metze
(This used to be ctdb commit e02417b2a55c45ac2c125b1b3463c9c39e7bc07a)
The functions file no longer causes a side-effect by doing a shift.
It also doesn't set a convenience variable for $1.
All eventscripts now explicitly use "$1" in their case statement, as
does the initscript. The absence of a shift means that the
takeip/releaseip events now explicitly reference $2-$4 rather than
$1-$3.
New function ctdb_standard_event_handler handles the status and
setstatus events, and exits for either of those events. It is called
via a default case in each eventscript, replacing an explicit status
case where applicable.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 3d55408cbbb3bb71670b80f3dad5639ea0be5b5b)
Apart from lots of cleanup work, this also fixes a bug where the share
checks didn't used to cope with directory names containing spaces.
The previous commit also loaded the config incorrectly.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 3c93336ab92c2e4829ff4dc360045bfa6df21d50)
This is the first stage of an experimental change to eventscripts.
Ronnie and I did a few hours of factorisation of 40.vsftpd and applied
many of the changes to 41.httpd. Other eventscripts were also
modified.
At this stage this is completely untested.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1)
use killtcp and kill both directions of the nfs connections.
we used to kill only one direction since the other direction was unkillble
but recent kernels allow us to kill both
(This used to be ctdb commit 8001ae580bcc28d45f6026b529d7ffc247cbba34)
There are 2 problems with this code:
* The loop in ctdb_check_directories_probe() breaks on filenames
containing whitespace.
The fix to protect them is to pass "$@" to this function and have it
operate on "$@".
Note that there's still a problem with whitespace in filenames in
the 50.samba eventscript. To fix this ctdb_check_directories_probe
should read the filenames from stdin. Another time...
* The check for '%' in filenames in ctdb_check_directories_probe()
ends up involving several forks. On a modern machine this can cost
a couple of minutes when checking a large number of directories.
The fix is to use a case statement.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit eb1fecaef9aa5cb85dff7d4f7af8a9878deabed8)
Change the monitor event in 40.vsftpd so it only fails if there are 2
successive failures connecting to port 21. This reduces the
likelihood of unhealthy nodes due to vsftpd being restarted for
reconfiguration due to node failover or system reconfiguration.
New eventscript functions ctdb_counter_init, ctdb_counter_incr,
ctdb_counter_limit. These are used to count arbitrary things in
eventscripts, depending on the eventscript name and a tag that is
passed, and determine if a specified limit has been hit. They're good
for counting failures!
These functions are used in 40.vsftpd and also in 01.reclock - the
latter used to do the counting without these functions.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit cfe63636a163730ae9ad3554b78519b3c07d8896)
Add a helper function that checks whether a unix domain socket exists
and there is a daemon LISTENING to it similar to the existing function
to check for a daemon LISTENING to a tcp/ip socket.
(This used to be ctdb commit 025a836ab3be3c078fccd8c10b10dfffbfdd94d0)
The netstat test only grepped for the ipv4 wildcard address.
Now the ipv6 wildcard listener is correctly detected as well.
Michael
(This used to be ctdb commit 78e7928797e239e71f96eb001460a0dbf943e18f)
This fixes tcp port monitor events on systems, where netcat or nc
is not found in /usr/bin/, Debian, for instance.
The patch also separates the process of finding the binaries and
calling them, moving the detection outside of the loop over the
ports list.
Michael
(This used to be ctdb commit 3adf100e7f0c04aaf2da9ae4c6984cdb708c3b57)
This prevents the monitor action of 50.samba from failing
on e.g. a typical [homes] service with "path = /home/%S" .
Michael
(This used to be ctdb commit 023d6c2e3017d323b5a70f987f3b4e0b8b8f0f7b)
Kevin Collins noticed that RHEL5 grep-2.5.1-54.2.el5 built for
x86 does not handle \s while the exact same RHEL5 package for amd64
does!
[[:space:]] is more portable. Even across the same package version ( different architecture ) from the same vendor :-)
(This used to be ctdb commit fd7bb21c4f9289fc34a57f9d8cb7c13a02d06096)
check for tcp ports
(the check for these tools should not really use hardcoded paths)
(This used to be ctdb commit 56d77082c07a519dd3804cc24cc7ba889b8469ff)
- added monitoring of the ethernet link state
When monitoring detects an error, the node loses its public IP address
(This used to be ctdb commit 0af57aead8c983511d25774b4ffe09fa5ff26501)