We reduce the number of failures before attempting a restart.
However, after 6 failures we mark the cluster unhealthy and no longer
try to restart. If the previous 2 attempts didn't work then there
isn't any use in bogging the system down with an attempted restart on
every monitor event.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit f654739080b40b7ac1b7f998cacc689d3d4e3193)
This adds a helper function called nfs_check_rpc_service() and uses it
to make the monitor event much more readable. An example of usage is
as follows:
nfs_check_rpc_service "mountd" \
-ge 10 "verbose restart:b unhealthy" \
-eq 5 "restart:b"
The first argument to nfs_check_rpc_service() is the name of the RPC
service to be checked. The RPC service corresponding to this command
is checked for availability using the rpcinfo command. If the service
is available then the function succeeds and subsequent arguments are
ignored.
If the rpcinfo check fails then a failure counter for that particular
RPC service is incremented and subsequent arguments are processed in
groups of 3:
1. An integer comparison operator supported by test.
2. An integer failure limit.
3. An action string.
The value of the failure counter is checked using (1) and (2) above.
The first check that succeeds has its action string processed - note
that this explains the somewhat curious reverse ordering of checks.
In the example above:
* If the counter is >= 10 then a verbose message is printed
describing the failure, the service is restarted in the background
and the node is marked as unhealthy (via an "exit 1" from the
function).
* If the counter is == 5 then the service is restarted in the
background.
For more action options please see the code.
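The counter-and-triples logic described above can be sketched roughly as
follows. This is a minimal illustration, not the ctdb implementation: the
names check_rpc_service, rpc_check and counter_dir are hypothetical,
rpc_check stands in for the rpcinfo probe and always fails so the example
runs anywhere, and the action string is only printed rather than executed.

```shell
# Hypothetical sketch; not the actual ctdb code.
# rpc_check stands in for the rpcinfo probe and always reports failure.
rpc_check () { return 1 ; }

# Assumed location for the per-service failure counters.
counter_dir=$(mktemp -d)

check_rpc_service ()
{
    _svc="$1" ; shift
    _file="$counter_dir/$_svc"

    if rpc_check "$_svc" ; then
        rm -f "$_file"      # service is healthy: reset the counter
        return 0
    fi

    # Increment the failure counter for this service.
    _count=$(cat "$_file" 2>/dev/null)
    _count=$(( ${_count:-0} + 1 ))
    echo "$_count" >"$_file"

    # Process (operator, limit, action) triples; the first match wins.
    while [ $# -ge 3 ] ; do
        _op="$1" ; _limit="$2" ; _actions="$3" ; shift 3
        if [ "$_count" "$_op" "$_limit" ] ; then
            echo "$_svc: count=$_count actions=$_actions"
            case " $_actions " in
                *" unhealthy "*) return 1 ;;
            esac
            return 0
        fi
    done
    return 0
}
```

Because the first matching check wins, the broad ">= 10" check has to be
listed before the "== 5" check, which is the reverse ordering mentioned
above.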
This also changes the ctdb_check_rpc() function so that it no longer
takes a program number to check. It now just takes a real RPC program
name that rpcinfo can resolve via /etc/rpc.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 9b66057964756a6245bafb436eb6106fb6a2866e)
This means that it now occurs on every reconfigure event. As a result
the ipreallocated event is removed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit c45a89418ba733ff91d48340d72bdb6d2ef80051)
Samba doesn't need to do anything for configuration changes. It will
notice configuration changes and reload automatically.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit de13350c17261032a7468c2cf4d2cf4a8d66a840)
* Reduce the failure counts so that restart attempts happen sooner.
* Use service_start() and service_stop() for the restart.
ctdb_service_start() resets the failure count, which isn't very
useful in this context.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 01776b9f29af9ad5c8534649ece1bd100e450434)
Signed-off-by: Martin Schwenke <martin@meltin.net>
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 0f003f05e28037eefdce3a686fcb52cd2289af9d)
The state directory basename becomes "nfs" rather than "statd". One
line of code is moved from the "startup" event to service_start().
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit cc4c5c19af7efe01c48f73bb5ec5e607ed79db4c)
To simplify we also remove the reconfigure from the recovered event
because the monitor event will handle this very quickly anyway.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit da3aedd1a472b430b75989d3c157efedd382e327)
* Add an optional service name argument to existing reconfigure
functions.
* Use function service_reconfigure() instead of variable
$service_reconfigure to specify how a service is reconfigured.
* New function ctdb_service_check_reconfigure() reconfigures a service
if it is flagged for reconfigure.
* Remove $service_reconfigure settings from 40.vsftpd and 41.httpd -
they're the defaults.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 15d4111d0761d82f57d5d4f0b1227812d14e4d7c)
service_start is currently a variable. This makes passing arguments
hard. We change it to be a function and put default definitions into
the functions file.
We use a convention that if a service name argument is passed to a
redefined version of service_start() or service_stop() then it will
act unconditionally. If no argument is passed then it can use
internal logic to decide if services should really be started. This
is useful when a single eventscript handles multiple services.
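A rough sketch of this convention, for an eventscript that handles two
services. Only service_start() is a name from this commit; the flags
manage_nfs and manage_lockd and the helper start_one are hypothetical and
exist purely for illustration.

```shell
# Hypothetical sketch of the convention; not the actual ctdb code.
manage_nfs="yes"      # assumed configuration flags
manage_lockd="no"

start_one () { echo "starting $1" ; }

service_start ()
{
    if [ -n "${1:-}" ] ; then
        # A service name was passed: act unconditionally.
        start_one "$1"
        return 0
    fi

    # No argument: use internal logic to decide what to start.
    [ "$manage_nfs" = "yes" ] && start_one "nfs"
    [ "$manage_lockd" = "yes" ] && start_one "lockd"
    return 0
}
```

With these settings, calling service_start with no argument starts only
nfs, while "service_start lockd" starts lockd regardless of its flag.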
This is a cherry-pick of ae38895 that needed to be reset mid-stream.
There is still some breakage following this commit.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 86e4aefed9fd1028660c98e3ea758c2b75ffc1d8)
Currently it checks $CTDB_MANAGES_WINBIND directly in several places.
This doesn't work when someone sets $CTDB_MANAGED_SERVICES directly.
This modifies check_ctdb_manages_winbind() so that it returns a
condition rather than modifying $CTDB_MANAGES_WINBIND. This makes
some code more readable.
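As an illustration of a check that returns a condition instead of
modifying a variable, something along these lines would work. The
function name manages_winbind is hypothetical, and treating
$CTDB_MANAGED_SERVICES as a space-separated list is an assumption made
for this sketch.

```shell
# Hypothetical sketch; not the actual ctdb code.
# Returns 0 if winbind is managed, 1 otherwise. Modifies no variables.
manages_winbind ()
{
    [ "${CTDB_MANAGES_WINBIND:-}" = "yes" ] && return 0

    # Assumption: $CTDB_MANAGED_SERVICES is a space-separated list.
    case " ${CTDB_MANAGED_SERVICES:-} " in
        *" winbind "*) return 0 ;;
    esac
    return 1
}
```

Callers can then write "if manages_winbind ; then ..." and get the same
answer whichever variable the administrator set.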
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 538902fbc1e74134a03987b36b3733ad641f8971)
Currently it checks $CTDB_MANAGES_SAMBA directly. This doesn't work
when someone sets $CTDB_MANAGED_SERVICES directly.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit d8f0f8948abd340088720718fef7dc858661ba23)
When the value of $CTDB_MANAGES_SAMBA or $CTDB_MANAGES_WINBIND changes
(or corresponding changes are made to $CTDB_MANAGED_SERVICES), the
associated service should be started or stopped as necessary.
This adds calls to ctdb_start_stop_service() to manage
starting/stopping samba and winbind.
An associated cleanup is made to the initial checks that one of
$CTDB_MANAGES_SAMBA or $CTDB_MANAGES_WINBIND is set, replacing them
with calls to is_ctdb_managed_service().
To handle the winbind cases ctdb_start_stop_service() and
is_ctdb_managed_service() are updated to take an optional service name
parameter.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Conflicts:
config/events.d/50.samba
Most of this merged elsewhere. This just removes a check that
this is the monitor event.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 257a2e350280c0b76ed2fac588cad167381fda52)
In dash, this fails gracefully with nothing to stderr:
t=$(cat /does_not_exist) 2>/dev/null
In bash the error from cat is still printed due to different order of
evaluation.
This works everywhere:
t=$(cat /does_not_exist 2>/dev/null)
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit a6e61867c7a58d5a77cd8641d8df0b105cddff77)
Also remove some unnecessary absolute paths for commands, which were
making the code slightly difficult to read.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 1b3f2dd62efb240f8486016fe0f8dfb73d6ccc66)
This also fixes a bug where update_config_from_tdb() used an incorrect
filename in one place.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit a5ce2adaa39f077f56582072a97bb64d0eba4b4d)
Without this you can get into a situation where ctdbd can not start.
If the active file for a service exists but the service is not
running, then trying to stop the service may fail, causing the
eventscript to exit from ctdb_start_stop_service().
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 28379ca0f747c5952d690a451834ce7421adfd34)
POSIX sh doesn't have local variables. Debian's dash doesn't behave
the same way as bash on this construct:
local var=`command that produces multiple words`
It only assigns the 1st word and may print an error.
Just remove the use of the "local" keyword in monitor_interfaces() to
solve this. It isn't actually limiting the scope of any variables
that are used outside the function.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 95d9a1e19655461288a2c7e52abf9d01ab23e05a)
Call call_proc(), put the output into a variable and then use it.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 2dfdc997f432d522034922b43cb6f8f878d11ba7)
For eventscript unit testing it will be necessary to override external
commands to allow stub implementations to be used. If absolute paths
aren't used then this can be done using either a fake bin/
subdirectory or by using shell functions.
This removes all of the simple cases of absolute paths.
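The shell-function approach to stubbing can be sketched like this. The
function node_label is a hypothetical stand-in for eventscript code, and
hostname is used only because it is a command available everywhere.

```shell
# Hypothetical eventscript-style code that calls an external command
# by its bare name rather than by an absolute path.
node_label ()
{
    echo "node-$(hostname)"
}

# Test stub: a shell function with the same name shadows the real
# command. This would have no effect if node_label called /bin/hostname.
hostname ()
{
    echo "testhost"
}
```

The fake bin/ alternative achieves the same thing by putting a directory
of stub executables at the front of $PATH.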
Signed-off-by: Martin Schwenke <martin@meltin.net>
Conflicts:
config/ctdb.init
config/events.d/50.samba
Keep old code but remove absolute paths.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 05851d50b0078de8bf4691442d718825adca6fe8)
If the last IP address on an interface is removed then that
interface should no longer be checked by 10.interfaces. However,
"ctdb ifaces" still lists such interfaces so they are currently
checked.
The problem really needs to be addressed in ctdbd but a neat quick
eventscript fix will be minimally invasive...
This changes the code to use "ctdb -Y ip -v" instead of "ctdb -Y
ifaces". The former includes details of all public addresses and
associated interfaces, so when an address is removed there is no
output for it. This avoids orphaned interfaces from being listed.
The logic is also slightly improved so that $IFACES includes just a
(non-uniquified) list of interfaces, allowing an existing loop to be
removed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 49b2d1bd9554461ed8edbfc21e777c0eca9e1443)
adding/removing IP addresses causing routes might be dropped by the system.
The easiest workaround for this is to unconditionally try to reapply
all static routes for all interfaces once ipreallocation has finished,
not just adding them back on the affected interface.
This works around a funky issue in
CQ S1023538
(This used to be ctdb commit 84600d1f53632d5fe76c308727f31f61b5ec1010)
in use by public addresses. This can happen when we have removed existing interfaces/IP addresses, and it prevents us from verifying the status of other interfaces.
(This used to be ctdb commit d67955b42f7627be9dae995230c8fcbb8a948ec2)
script if/when we have for example NATGW configured but no public addresses defined on that interface
CQ S1023378
(This used to be ctdb commit 8837daa424732aeb5a20814b1709c345a97a0e09)
We cannot just check whether the MII Status is up for bonding mode 4, since the kernel will always report the bond device as UP
even if all cables are disconnected.
For mode 4, ignore the status of the bond device and instead check whether at least one slave interface is up
when determining whether the device is good or bad.
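A rough sketch of this check, reading bonding status text on stdin. The
function name bond_is_up is hypothetical, and the line formats matched
here ("Bonding Mode:", "MII Status:", "Slave Interface:") are an
assumption based on what the Linux bonding driver usually prints under
/proc/net/bonding/; this is not the code from the commit.

```shell
# Hypothetical sketch; not the actual ctdb code.
# Succeeds if the bond described on stdin should be considered up.
bond_is_up ()
{
    awk '
        /^Bonding Mode: IEEE 802.3ad/ { lacp = 1 }
        /^Slave Interface:/           { in_slave = 1 }
        /^MII Status: up/ {
            # Status lines before the first slave describe the bond itself.
            if (in_slave) slave_up = 1 ; else bond_up = 1
        }
        END {
            # Mode 4: ignore the bond status, require one slave up.
            if (lacp) exit (slave_up ? 0 : 1)
            exit (bond_up ? 0 : 1)
        }
    '
}
```

It could be used as "bond_is_up </proc/net/bonding/bond0", assuming a
bond device with that name exists.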
(This used to be ctdb commit a6930cec6d9503dba18b9d4839d87a1c1a8ddba2)
Simplify the handling of setting the links in the 10.interfaces eventscript
and remove the optimization to only call setifacelink on state change
to make the code simpler to read.
If a take ip event fails, flag the node as unhealthy.
Add a check to the interface script to check if the interface exists
or if it has been deleted, so that we can detect this and become
UNHEALTHY if someone deletes an interface we are using to host public
addresses.
(This used to be ctdb commit 4ab63d2a7262aff30d5eced184c294c9c9dd4974)
* continous -> continuous
* activete -> activate
(thanks to lintian)
See https://bugzilla.samba.org/show_bug.cgi?id=6935
Signed-off-by: Michael Adam <obnox@samba.org>
(This used to be ctdb commit fb6987c2f747d6dbf9bb3899a480124d1c242a90)