This is done by 10.interface, where the monitor event fails when there
is a missing interface. The in-daemon interface checking adds no
value.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is like CTDB_CONTROL_GET_NODEMAP but it loads from the nodes file
instead of the daemon.
Also add a new client function ctdb_ctrl_getnodesfile().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Every time a nodemap is constructed the node IP addresses all need to
be parsed. This is not a productive use of CPU.
Instead, parse each string once when the nodes file is loaded. This
results in much simpler code.
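As a rough illustration of the idea (a hedged sketch only: the names node_entry and node_entry_init are made up, and the real CTDB code also handles IPv6), the address string is parsed into a sockaddr once when the nodes file line is read, so building a nodemap later just copies the already-parsed value:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

struct node_entry {
        char addr_str[64];         /* as read from the nodes file */
        struct sockaddr_in addr;   /* parsed once, reused afterwards */
};

/* called once per line when the nodes file is loaded */
static int node_entry_init(struct node_entry *node, const char *line)
{
        memset(node, 0, sizeof(*node));
        strncpy(node->addr_str, line, sizeof(node->addr_str) - 1);
        node->addr.sin_family = AF_INET;
        if (inet_pton(AF_INET, line, &node->addr.sin_addr) != 1) {
                fprintf(stderr, "invalid node address: %s\n", line);
                return -1;
        }
        return 0;
}

/* constructing a nodemap later only copies node->addr; nothing is re-parsed */
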
This code also removes the use of ctdb_address. Duplicating the port
is pointless without an abstraction layer around ctdb_address. If
CTDB gets an incompatible transport in the future then add an
abstraction layer.
Note that the infiniband code is not updated. Compilation of the
infiniband code is already broken. Fixing it will be a separate,
properly tested effort.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
This is currently set in 2 places. One of them makes the node loading
code difficult to refactor. Also, when the surrounding code in either
place is touched, it might get broken.
This only needs to be done once at startup, not on every reload. So
do it once in a very obvious way, sacrificing a few CPU cycles for
some added clarity.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Each node reload unnecessarily and incorrectly resets the VNN map,
causing a potentially unnecessary recovery. When nodes are reloaded
any newly deleted nodes should already be disconnected and any newly
added nodes should also be disconnected. This means that reloading
the nodes file should not cause a change in the VNN map.
The current implementation also leaks memory every time the nodes are
reloaded.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
It is much simpler for most cases to have a syslog backend that
doesn't need a separate CTDB-specific logging daemon. This loses the
lossy, non-blocking mode provided by logd. However, a corresponding
feature with a completely different implementation (not requiring an
extra daemon) will be re-added into the syslog backend. In an ideal
world the new implementation would be added first but unfortunately
that is hard to do because the logd code is hooked in at more than one
place.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Deferred calls should not be treated as pending calls since they are
re-processed from the beginning.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
This makes it consistent with Samba, to ease transition.
Update unit test code to link with tdb_wrap instead of including
db_wrap.c.
There are some potential whitespace fixes in this commit that have
been ignored. CTDB's lib/tdb_wrap will be deleted after the
transition to Samba's lib/tdb_wrap, so there's no point polishing it
too much.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
To avoid warnings when using --enable-developer, which uses
-Wmissing-prototypes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This duplicates ctdb->ctdbd_pid.
Thanks to Sumit Bose <sbose@redhat.com> for the suggestion.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
If something unexpectedly uses fork() then an exiting child will
remove the PID file while the main daemon is still running. The real
test is whether the current process has the PID of the main CTDB
daemon, which is the process that calls setsid().
This could be done using getpgrp() instead. At the moment the
eventscript handler harmlessly calls setpgid() - harmless because the
atexit() handlers are cleared upon exec(). However, it is possible
that process groups will be used more in future so it is probably
better to rely on the session ID.
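One way to express that check is to compare the current process against the session leader created by setsid(). This is only an illustrative sketch, not the actual implementation; pidfile_path and remove_pidfile are made-up names and the path is an assumption:

#include <stdlib.h>
#include <unistd.h>

static const char *pidfile_path = "/var/run/ctdb/ctdbd.pid";

/* atexit() handler: only the session leader (the main daemon, which
 * called setsid()) removes the PID file; forked children skip it */
static void remove_pidfile(void)
{
        if (getsid(0) == getpid()) {
                unlink(pidfile_path);
        }
}

int main(void)
{
        setsid();                  /* in a real daemon this runs after fork() */
        atexit(remove_pidfile);
        /* ... daemon main loop ... */
        return 0;
}
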
Thanks to Sumit Bose <sbose@redhat.com> for the idea.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Currently ctdbd_wrapper depends on the session ID. Very soon PID file
removal will too. :-)
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This function does not block signals, but ignores them.
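For clarity, the distinction captured by the rename is roughly this (a generic POSIX sketch, not CTDB's code; the function names are illustrative):

#include <signal.h>
#include <string.h>

/* ignoring: the signal is discarded entirely */
static void ignore_signal(int signum)
{
        struct sigaction sa;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = SIG_IGN;
        sigemptyset(&sa.sa_mask);
        sigaction(signum, &sa, NULL);
}

/* blocking: delivery is merely deferred until the signal is unblocked */
static void block_signal(int signum)
{
        sigset_t set;

        sigemptyset(&set);
        sigaddset(&set, signum);
        sigprocmask(SIG_BLOCK, &set, NULL);
}
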
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
At the moment ctdb_check_healthy() is overloaded to wait until the
first recovery is complete, handle the "startup" event and also
actually handle monitoring. This is untidy and hard to follow.
Instead, have the daemon explicitly wait for 1st recovery after the
"setup" event. When first recovery is complete, schedule a function
to handle the "startup" event. When the "startup" event succeeds then
explicitly enable monitoring.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This was added to support external monitoring using CTDB event scripts.
However, it was never used.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
No need to pass it as an extra argument to ctdb_start_daemon.
Also ensure options.public_address_list gets a nice static default.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit a3d63a9db89d08bb284b3b3a6db773422f21b477)
This removes data types and structure elements related to TRANS2
persistent transaction code.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 22a253b7ccf1ff854cddf0b67969dc84d7d6a654)
Register print_exit_message() earlier so that it covers most of the
early exits.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 90d792cf28d6a823141e4c417b6978f02a9cf596)
Don't blindly remove the socket.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 3dd5b925dcf0e9a5b877638e471c5ecf36b46c58)
This is slightly easier to read because it all fits on 1 line.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 035bf3eecf99337c84d4ad16cdbf297b1fa037db)
The "init" event only really fails in the scripts, which should log
something useful on failure. Therefore, a core dump isn't terribly
useful and sometimes attracts unwanted attention.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 3af2d833b63af9931792106db71797f3692669a8)
The runstate can't be set to SHUTDOWN twice, so the current naive code
causes a panic on the 2nd shutdown. This regression was introduced in
commit 8076773a9924dcf8aff16f7d96b2b9ac383ecc28.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit f1b7ca8dc3f34a59c7b3e55748f974ac9ed8f458)
It should run before:
* the transport is started;
* databases are attached; and
* configuration files (e.g. nodes, public_addresses) are processed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 0a0c8543f167e11b75a622513367b083e42cbd3f)
The "setup" event can fail when one of the eventscripts fails to run
its "setup" event. If this occurs then the eventscript should log an
error. The stack trace and core file generated when we abort provide
no useful information.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit c50eca6fbf49a6c7bf50905334704f8d2d3237d7)
This adds more serialisation to the startup, ensuring that the
"startup" event runs after everything to do with the first recovery
(including the "recovered" event).
Given that it now takes longer to get to the "startup" state, the
initscript needs to wait until ctdbd gets to "first_recovery".
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit ed6814ff0a59ddbb1c1b3128b505380f60d7aeb7)
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit f43fe3a560d5915c1a9893256f4e7bfe3d7e290a)
This deconstructs ctdb_start_transport(), which did much more than
starting the transport.
This removes a very unlikely race and adds some clarity. The setup
event is supposed to set the tunables before the first recovery.
However, there was nothing stopping the first recovery from starting
before the setup event had completed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit c31feb27dcdb748b5333321c85fe54852dfa1bcf)
This allows states, including startup and shutdown states, to be
clearly tracked. This doesn't include regular runtime "states", which
are handled by node flags.
Introduce new functions ctdb_set_runstate(), runstate_to_string() and
runstate_from_string().
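A minimal sketch of the pattern (the actual runstate names and values in CTDB may differ; this is illustrative only):

#include <stddef.h>
#include <strings.h>

enum runstate {
        RUNSTATE_UNKNOWN = 0,
        RUNSTATE_INIT,
        RUNSTATE_SETUP,
        RUNSTATE_FIRST_RECOVERY,
        RUNSTATE_STARTUP,
        RUNSTATE_RUNNING,
        RUNSTATE_SHUTDOWN,
};

static const char *runstate_names[] = {
        "UNKNOWN", "INIT", "SETUP", "FIRST_RECOVERY",
        "STARTUP", "RUNNING", "SHUTDOWN",
};

static const char *runstate_to_string(enum runstate r)
{
        if ((size_t)r >= sizeof(runstate_names) / sizeof(runstate_names[0])) {
                return "UNKNOWN";
        }
        return runstate_names[r];
}

static enum runstate runstate_from_string(const char *s)
{
        size_t i;

        for (i = 0; i < sizeof(runstate_names) / sizeof(runstate_names[0]); i++) {
                if (strcasecmp(s, runstate_names[i]) == 0) {
                        return (enum runstate)i;
                }
        }
        return RUNSTATE_UNKNOWN;
}
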
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 8076773a9924dcf8aff16f7d96b2b9ac383ecc28)
Otherwise the messages are in a stupid order... :-)
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reported-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit cd87ba85fc6c375758c7d3dfa8dbd4d8a02074b0)
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit c7eab97c7a939710b73aae2d75b404b235a998f5)
This fixes the problem of "ctdb statisticsreset" clearing the number of
clients even when there are active clients.
Values returned in statistics for frozen, recovering, memory_used are based on
the current state of CTDB and are not maintained as statistics. This should
include num_clients as well.
Currently ctdb->num_clients is unused, so use it to track the number of
clients and fill in the statistics field only when requested.
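The idea, roughly (a hedged sketch with made-up structure names, not CTDB's actual definitions):

/* live counter maintained on connect/disconnect */
struct daemon_state { unsigned int num_clients; };
/* snapshot handed back by "ctdb statistics" */
struct statistics   { unsigned int num_clients; /* ... other counters ... */ };

static void client_connected(struct daemon_state *d)    { d->num_clients++; }
static void client_disconnected(struct daemon_state *d) { d->num_clients--; }

/* derived from current state when requested, so a statistics reset
 * cannot push the client count negative */
static void fill_statistics(const struct daemon_state *d, struct statistics *s)
{
        s->num_clients = d->num_clients;
}
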
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit dc4ca816630ed44b419108da53421331243fb8c7)
Unexpected removal of this file can have serious consequences, so it
is best if this is logged at the default level.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit bfed6a8d1771db3401d12b819204736c33acb312)
Default is not to create a pid file.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
(This used to be ctdb commit 996e74d3db0c50f91b320af8ab7c43ea6b1136af)
When CTDB is busy with lots of smbd clients, too much time was spent in
daemon_check_srvids(), which searches the list of srvids registered for
message handlers. Using a hash-based index instead of a linear search of
the linked list significantly improves performance.
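The shape of the change is roughly the following (illustrative only; CTDB's real data structures and hash function differ):

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define SRVID_HASH_SIZE 512

struct srvid_entry {
        uint64_t srvid;
        struct srvid_entry *next;
};

struct srvid_index {
        struct srvid_entry *buckets[SRVID_HASH_SIZE];
};

/* fold the 64-bit srvid down to a bucket number */
static unsigned int srvid_hash(uint64_t srvid)
{
        return (unsigned int)((srvid ^ (srvid >> 32)) % SRVID_HASH_SIZE);
}

/* lookup is now a short chain walk instead of a scan of every handler */
static bool srvid_registered(const struct srvid_index *idx, uint64_t srvid)
{
        const struct srvid_entry *e;

        for (e = idx->buckets[srvid_hash(srvid)]; e != NULL; e = e->next) {
                if (e->srvid == srvid) {
                        return true;
                }
        }
        return false;
}

static int srvid_register(struct srvid_index *idx, uint64_t srvid)
{
        struct srvid_entry *e = malloc(sizeof(*e));

        if (e == NULL) {
                return -1;
        }
        e->srvid = srvid;
        e->next = idx->buckets[srvid_hash(srvid)];
        idx->buckets[srvid_hash(srvid)] = e;
        return 0;
}
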
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 3e09f25d419635f6dd679b48fa65370f7860be7d)
Some subprocesses print "CTDB daemon shutting down" when they exit and
this can be confusing.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit f1ffe1112b7e342d7f1228ca816a8e5918f893cf)
ctdb_start_transport() is called just before the "setup" event, when CTDB
is ready to process requests. The "startup" event happens much later,
after a successful recovery.
The transport method ctdb->methods is successfully initialized before
ctdb_start_transport() is called, so there is no need to check it again.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 9a70a4d23d00f6cb996c061ba3dfb7c47b4f6a4f)
Currently flags are initialised in 2 places. One of them is in
ctdb_tcp_listen_automatic(), which just seems wrong. This makes the
code easier to follow by just doing it in ctdb_start_daemon().
This means that the flags are now initialised later than previously.
However, it is still done before the transport is started and before
clients can connect.
In future it might make sense to do a similar thing with setting the
PNN. However, the current optimisation is reasonably obvious...
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 2bbee8ac23ad5b7adf7122d8c91d5f0d54582507)
Reimplement 5aba53e6adcfcd7edbdac9e30aa5fcba176aca00 using tevent
trace points.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 98e1b46adba11b9549b5c5976e1f561fe732fa6e)
Wrap all creation of child processes inside ctdb_fork(), which is used to track all processes we have spawned.
Capture SIGCHLD to also track which child processes have terminated.
Wrap kill() inside ctdb_kill() and make sure that we never send a non-zero signal to a child process pid that has already terminated (and might have been replaced with an unrelated process reusing the same pid).
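A rough sketch of the tracking idea (simplified and illustrative; the real ctdb_fork()/ctdb_kill() implementation differs and the fixed-size table here is an assumption):

#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_TRACKED 1024
static pid_t tracked[MAX_TRACKED];

/* fork() wrapper: the parent remembers every child it spawns */
static pid_t tracked_fork(void)
{
        pid_t pid = fork();

        if (pid > 0) {
                for (int i = 0; i < MAX_TRACKED; i++) {
                        if (tracked[i] == 0) {
                                tracked[i] = pid;
                                break;
                        }
                }
        }
        return pid;
}

/* SIGCHLD handler (installed with sigaction): forget children as they exit */
static void sigchld_handler(int sig)
{
        pid_t pid;

        (void)sig;
        while ((pid = waitpid(-1, NULL, WNOHANG)) > 0) {
                for (int i = 0; i < MAX_TRACKED; i++) {
                        if (tracked[i] == pid) {
                                tracked[i] = 0;
                                break;
                        }
                }
        }
}

/* kill() wrapper: never send a real signal to a pid we no longer track */
static int tracked_kill(pid_t pid, int signum)
{
        if (signum != 0) {
                bool known = false;

                for (int i = 0; i < MAX_TRACKED; i++) {
                        if (tracked[i] == pid) {
                                known = true;
                                break;
                        }
                }
                if (!known) {
                        return 0;       /* the child is already gone */
                }
        }
        return kill(pid, signum);
}
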
(This used to be ctdb commit f73a4b1495830bcdd094a93732a89dd53b3c2f78)
and also update the "read public address file" code to not check whether the address already exists locally when we read it from the child process, to stop it
from spamming the logs with "We already host ..."
messages.
(This used to be ctdb commit 334ea830f1bf33419f4a1e78f23afd41a852d0f4)
Also add a method to use the recovery master/daemon to reload the public ips on all nodes in the cluster.
Reloading the public ips on all nodes in the cluster is only supported if all nodes in the cluster are available and healthy.
(This used to be ctdb commit 05603e914f8c12618d7e06943c0f7df207f645b0)
Every time we give a delegation to another node we count this as one delegation.
If the same record is delegated to several nodes we count one for each node.
Every time a record has all its delegations revoked we count this as one revoke.
(This used to be ctdb commit b098bcf8007be63889aaed640a951b0eeaa9d191)
We don't strictly need to force clients to use CTDB_FETCH_WITH_HEADER instead of CTDB_FETCH when they ask for readonly records.
Have ctdbd remap this internally to FETCH_WITH_HEADER and map the reply back to CTDB_FETCH_FUNC or CTDB_FETCH_WITH_HEADER_FUNC based on what the client initially asked for.
This removes the need for the client to know about the CTDB_FETCH_WITH_HEADER_FUNC function and simplifies the client code.
Clients that do not care about the header returned with the reply can just continue using the old CTDB_FETCH_FUNC call and ctdbd will do all the difficult work.
(This used to be ctdb commit 444a7bac4e9a854b06c1ad4cb36c2b58a72001fa)
server/ctdbd.c: In function ‘main’:
server/ctdb_daemon.c:943:7: warning: zero-length gnu_printf format string [-Wformat-zero-length]
(This used to be ctdb commit e6d1dd3ec4a078e5f32bc52a4a9e4b7d9a2e2d16)
When multiple clients fetch the same record concurrently, send only a single
fetch across the network and defer all other fetches locally.
This improves performance for hot records and reduces CPU load on ctdb.
(This used to be ctdb commit 82d6946ad8b3348e8b9d3d971f24925ade02d1be)
so we need a "ticker" in the main ctdbd daemon too, to ensure we get at least one event to process every second.
This will improve the accuracy of "Time jumped" messages and remove false positives when the recovery daemon is "slow".
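A minimal sketch of such a ticker using the tevent timer API (CTDB's code of that era used the event_* compatibility wrappers; the handler name and private data are illustrative):

#include <sys/time.h>
#include <talloc.h>
#include <tevent.h>

/* fires once per second and re-arms itself, guaranteeing the main loop
 * processes at least one event per second */
static void daemon_tick(struct tevent_context *ev, struct tevent_timer *te,
                        struct timeval current_time, void *private_data)
{
        /* per-second housekeeping (e.g. time-jump detection) goes here */
        tevent_add_timer(ev, ev, tevent_timeval_current_ofs(1, 0),
                         daemon_tick, private_data);
}

/* at daemon startup:
 *   tevent_add_timer(ev, ev, tevent_timeval_current_ofs(1, 0),
 *                    daemon_tick, ctdb);
 */
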
(This used to be ctdb commit 70154e5e19e219de086b2995d41e8f6e069ee20d)
Revert this patch:
commit 482c302d46e2162d0cf552f8456bc49573ae729d
We may need to use real-time processes for the main daemon and the recovery daemon to handle the cases where systems come under very high loads.
(This used to be ctdb commit 08bef9dcab6e4da15fc783f8624e5ed09aa060b5)
During this window we may still hit asynchronous events that will fail because we cannot send/receive packets from other nodes.
These failures are logged as "... Transport is DOWN" to help indicate that they are benign messages related to the process of shutting down.
These messages spam the syslog during normal shutdown, so this patch drops their loglevel to DEBUG, so that they will no longer appear in or spam the syslog.
(This used to be ctdb commit 8275d265d2ae19b765e30ecf18f6b6319b6e6453)
Add a new command "ctdb stats [num]" that prints the [num] most recent statistics intervals collected.
(This used to be ctdb commit e6e16fcd5a45ebd3739a8160c8fb5f44494edb9e)
In Samba this is now called "tevent", and while we use the backwards
compatibility wrappers they don't offer EVENT_FD_AUTOCLOSE: that is now
a separate tevent_fd_set_auto_close() function.
This is based on Samba version 7f29f817fa.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 85e5e760cc91eb3157d3a88996ce474491646726)
We've been seeing "Invalid packet of length 0" errors, but we don't know
what is sending them. Add a name for each queue, and print nread.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit e6cf0e8f14f4263fbd8b995418909199924827e9)
ctdb_client.h is the existing internal client interface (which was mainly
in ctdb.h), and ctdb_protocol.h is the information needed for the wire
protocol only.
ctdb.h will be the new, shiny, libctdb API.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 4bba6b8cd47b352f98d41f9f06258d5ac3c9adef)
We can't just drop packets to the list, as those packets could be part
of the core protocol the client is using. This happens (for example)
when Samba is doing a traverse. If we drop a traverse packet then
Samba hangs indefinitely. We are better off dropping the ctdb socket
to Samba.
(This used to be ctdb commit a7a86dafa4d88a6bbc6a71b77ed79a178fd802a6)
This is needed because the "startup" event runs after the initial recovery,
but we need to do some actions before the initial recovery.
metze
(This used to be ctdb commit e953808449c102258abb6cba6f4abf486dda3b82)
We don't want ctdb stalling due to paging; this can be far worse than
scheduling delays. But if we simply do mlockall(MCL_FUTURE), it
increases the risk that mmap (ie. tdb open) or malloc will fail,
causing us to abort.
This patch is a compromise: we mlock all current pages (including
10k of future stack for expansion) and then relock when a client
asks us to open a TDB. We warn, but don't exit, if it fails.
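The core of that compromise looks roughly like this (hedged sketch; locking the extra future stack pages mentioned above is omitted here):

#include <stdio.h>
#include <sys/mman.h>

/* called at startup and again when a client asks us to open a TDB */
static void lock_current_memory(void)
{
        /* lock only what is mapped now (no MCL_FUTURE), so a later
         * mmap/tdb-open/malloc cannot fail due to mlock limits */
        if (mlockall(MCL_CURRENT) != 0) {
                perror("mlockall");   /* warn, but keep running */
        }
}
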
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 82f778e85440bc713d3f87c08ddc955d3cfce926)
1) It's buggy. Code needs to be carefully written (ie. no busy
loops) to handle running with it, and we fork and run scripts.[1]
2) It makes debugging harder. If ctdbd loops (as has happened recently)
it can be extremely hard to get in and see what's happening. We've already
seen the valgrind hacks.
3) We have seen recent scheduler problems. Perhaps they are unrelated,
but removing this very unusual setup is unlikely to hurt.
4) It doesn't make anything faster. Under all but the most perverse of
circumstances, 99% of the cpu gives the same performance as 100%, and
we will always preempt normal processes anyway.
[1] I made this worse in 0fafdcb8d353 "eventscript: fork() a child for
each script" by removing the switch_from_server_to_client() which
restored it, but even that was only for monitor scripts. Others were
run with RT priority.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 482c302d46e2162d0cf552f8456bc49573ae729d)
Depending on --max-persistent-check-errors we allow ctdb
to start with unhealthy persistent databases.
The default is 0 which means to reject a startup with
unhealthy dbs.
The health of the persistent databases is checked after each
recovery. Node monitoring and the "startup" event are deferred
until all persistent databases are healthy.
Databases can become healthy automatically by a completely
HEALTHY node joining the cluster, or by an administrator
with "ctdb backupdb/restoredb" or "ctdb wipedb".
metze
(This used to be ctdb commit 15f133d5150ed1badb4fef7d644f10cd08a25cb5)
This might be a bit less efficient, but experience in winbind has shown that
event callbacks can trigger changes in the socket state in very hard to
diagnose ways.
(This used to be ctdb commit a78b8ea7168e5fdb2d62379ad3112008b2748576)
This control is only used by samba when samba wants to check if a subrecord held by a <node-id>:<smbd-pid> is still valid or if it can be reclaimed.
If the node is banned or stopped, we kill the smbd process and report to the caller that the process does not exist. This allows us to recover subrecords from stopped/banned nodes where smbd is hung waiting for the databases to thaw.
bz58185
(This used to be ctdb commit 157807af72ed4f7314afbc9c19756f9787b92c15)
Add the mapping to the list every time we accept() a new client connection
and set it up to be removed in the destructor when the client structure is freed.
(This used to be ctdb commit f75d379377f5d4abbff2576ddc5d58d91dc53bf4)
and store this in the client structure.
There is no longer any need to rely on the hack where samba sends some special
message handle registrations that encode the pid in the srvid.
This might not work on AIX, since I recall some issues with getting the pid
this way on that platform.
(This used to be ctdb commit b4a7efa7e53e060a91dea0e8e57b116e2aeacebf)
add a global variable holding the pid of the main daemon.
change the tracking of time() in the event loop to only check/warn when called from the main daemon
(This used to be ctdb commit a10fc51f4c30e85ada6d4b7347b0f9a8ebc76637)
The way to use this from a client is :
1, first create a message handle and bind it to a SRVID
A special prefix for the srvid space has been set aside for samba :
Only samba is allowed to use srvid's with the top 32 bits set like this.
The lower 32 bits are for samba to use internally.
2, register a "notification" using the new control :
CTDB_CONTROL_REGISTER_NOTIFY = 114,
This control takes as indata a structure like this :
struct ctdb_client_notify_register {
        uint64_t srvid;
        uint32_t len;
        uint8_t notify_data[1];
};
srvid is the srvid used in the space set aside above.
len and notify_data are an arbitrary blob.
When notifications are later sent out to all clients, this is the payload of that notification message.
If a client has registered with control 114 and then disconnects from ctdbd, ctdbd will broadcast a message to that srvid to all nodes/listeners in the cluster.
A client can register itself with as many different srvids as it wants, but this is handled through a linked list from the client structure, so it is mainly designed for "few notifications per client".
3, a client that no longer wants to have a notification set up can deregister using control
CTDB_CONTROL_DEREGISTER_NOTIFY = 115,
which takes this as arguments :
struct ctdb_client_notify_deregister {
        uint64_t srvid;
};
When a client deregisters, a message will no longer be sent to all other clients when this client disconnects from ctdbd.
(This used to be ctdb commit f1b6ee4a55cdca60f93d992f0431d91bf301af2c)
Add a new tunable to control the maximum queue size we allow to a blocked client before we start discarding REQ_MESSAGES instead of queueing them for delivery.
This avoids queueing up a very large number of MESSAGES that the samba instances send
between each other to nodes that are blocked/banned/stopped for extended periods.
(This used to be ctdb commit f76d6fed8f9630450263b9fa4b5fdf3493fb1e11)
so we can spot if there are leaks.
plug two file descriptor leaks related to failures when sending ARP,
and one leak when we cannot parse the local address during TCP connection establishment
(This used to be ctdb commit ddd089810a14efe4be6e1ff3eccaa604e4913c9e)
In ctdb_client.c:ctdb_transaction_commit(), after a failed
TRANS2_COMMIT control call (for instance due to the 1-second
timeout being exceeded while waiting for a busy node's reply), there is a
1-second gap between the transaction_cancel() and
replay_transaction() calls in which there is no lock on the
persistent db. And because there is no global state in ctdbd
indicating that a transaction is in progress, other nodes
may succeed in starting transactions on the db in this gap and,
even worse, work on top of the possibly already pushed changes.
So the data diverges on the several nodes.
This change fixes this by introducing global state for a transaction
commit being active in the ctdb_db_context struct and in a db_id field
in the client, so that a client keeps track of _which_ tdb it has a
transaction commit running on. These data are set by ctdb upon
entering the trans2_commit control and they are cleared in the
trans2_error or trans2_finished controls. This makes it impossible
to start another transaction or migrate a record to a different
node while a transaction is active on a persistent tdb, including
the retry loop.
This approach is deadlock-free and still allows the recovery process
to be started in the retry gap between cancel and replay.
Also note that this solution does not require any change on the
client side.
This was debugged and developed together with
Stefan Metzmacher <metze@samba.org> - thanks!
Michael
(This used to be ctdb commit f88103516e5ad723062fb95fcb07a128f1069d69)
Otherwise there is the chance that we will reset the statistics to zero after the counter has been incremented (client connects), and when the client disconnects we decrement it to a negative number.
This is a purely cosmetic patch with no operational impact on ctdb.
(This used to be ctdb commit 72f1c696ee77899f7973878f2568a60d199d4fea)
Fix a race between the ctdb tool and the recovery daemon, both trying to
push flag changes across the cluster at the same time.
(This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa)
log the type of operation and the database name for all latencies higher
than a threshold
(This used to be ctdb commit 1d581dcd507e8e13d7ae085ff4d6a9f3e2aaeba5)
we currently only monitor that the daemons are running by kill(0, pid)
and by verifying that the domain socket between them is ok.
this is not sufficient since we can have a situation where the recovery
daemon is hung.
this new code monitors that the recovery daemon is operating.
if the recovery daemon hangs, we log this and shut down the main daemon
(This used to be ctdb commit cd69d292292eaab3aac0e9d9fc57cb621597c63c)
This allows ctdb to automatically start a new full blown recovery
if a client has started updating the local tdb for a persistent database
but is kill -9ed before it has ensured the update is distributed clusterwide.
(This used to be ctdb commit 1ffccb3e0b3b5bd376c5302304029af393709518)
This reverts commit bfba5c7249eff8a10a43b53c1b89dd44b625fd10.
revert the waitpid changes. we need to waitpid for some children, so we should
refactor the approach completely
(This used to be ctdb commit 702ced6c2fe569c01fe96c60d0f35a7e61506a96)
so we should not call it from the main daemon.
1, set SIGCHLD to SIG_DFL to make sure we ignore this signal
2, get rid of all waitpid() calls
3, change reporting of event script status code from _exit()/waitpid() to write()/read() one byte across the pipe.
(This used to be ctdb commit bfba5c7249eff8a10a43b53c1b89dd44b625fd10)
If we shutdown the transport and CTDB later decides to send a command out
for queueing, the call to ctdb->methods->allocate_pkt() will SEGV.
This could trigger for example when we are in the process of shutting down CTDBD and have already shutdown the transport but we are still waiting for the
"shutdown" eventscripts to finish.
If the event scripts now take much much longer to execute for some reason, this
race condition becomes much more probable.
Decorate all dereferencing of ctdb->methods-> with a check that ctdb->methods is non-NULL
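The guard looks roughly like this (illustrative types and names, not CTDB's actual definitions):

#include <stdio.h>
#include <stddef.h>

struct transport_methods {
        int (*queue_pkt)(void *transport_data, const char *buf, size_t len);
};

struct daemon_ctx {
        struct transport_methods *methods;   /* NULL once the transport is shut down */
        void *transport_data;
};

static int send_pkt(struct daemon_ctx *ctdb, const char *buf, size_t len)
{
        if (ctdb->methods == NULL) {
                /* benign during shutdown: the transport is already down */
                fprintf(stderr, "Transport is DOWN - dropping packet\n");
                return -1;
        }
        return ctdb->methods->queue_pkt(ctdb->transport_data, buf, len);
}
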
(This used to be ctdb commit c4c2c53918da6fb566d6e9cbd6b02e61ae2921e7)
Add support in AIX to track the PID of a client that connects to the unix domain socket
(This used to be ctdb commit 4c006c675d577d4a45f4db2929af6d50bc28dd9e)
and a ctdb command to pull the talloc memory map from the recovery daemon:
ctdb rddumpmemory
(This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05)
nodes into two separate files.
move the monitoring of keepalives for detecting connected/disconnected
remote nodes into ctdb_keepalive.c
(This used to be ctdb commit 23a57b20c314d5f11a433cf251eb9d9de743849a)
of the startup event scripts after the point where recovery has
started and the node is in normal operation
This makes the 'startup' script just a special type of the 'monitor'
script which is called first
(This used to be ctdb commit 7424c30a5fd04aea0137c466b4318c3f185280d8)
set the node initially unhealthy and let the status monitoring bring the node online.
This fixes a problem with winbindd, where it refused to start because secrets.tdb was not populated,
but we could not populate it, because the net command would not run while ctdbd was still doing startup
and was thus frozen
(This used to be ctdb commit 3a001b793dd76fb96addf1e2ccb74da326fbcfbc)
see both the old flags as well as the new flags (so we can tell which
flags changed)
send the CTDB_SRVID_RECONFIGURE messages to connected nodes only, not to
every node, connected or not, in the cluster.
in the handler inside the recovery daemon which is invoked for node flag
change messages, only do a takeover_run() and redistribute the ip addresses IF it was the
disabled or the unhealthy flags that changed. Also send out the cluster
reconfigured message in this case.
If any of the other flags changed we don't need to do the takeover_run()
here since that will be done during recovery.
(This used to be ctdb commit 5549b2058e2c148a8ca9d419123acf3247bb8829)