IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
There is no sane way of keeping stdin open when using the shell to
background ctdbd in local_daemons.sh. Instead, have ctdbd fork when
not interactive and when test mode is enabled. become_daemon() can't
be used for this: if it forks then it also closes stdin.
For the interactive case, become_daemon() wasn't doing anything
special, so do nothing instead.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
These don't need to depend on do_fork. Child logging should be set up
whenever the daemon is not interactive. The stdin handler should be
setup whenever test mode is enabled.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
No functional changes.
This is staging for a change that makes ctdbd fork when test mode is
enabled but interactive is not set.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This allows a test environment to simply close its end of a pipe to
cleanly shutdown ctdbd. Like in smbd, this is only done if stdin is a
pipe or a socket.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This avoids a crash if ctdb_shutdown_sequence() is called before
monitoring is initialised.
Switch to using TALLOC_FREE() while touching this function.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Testing against a commonly used cluster filesystem has shown no
performance impact, as expected.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This changes the behaviour for some failures from exiting to simply
attempting to schedule the next run.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
All the vacuum operations if required have an event loop to ensure
completion of pending operations. Once all the steps are complete,
there is no reason to process any more packets.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
The only reason for recoverd attaching to databases was to migrate
records to the local node as part of vacuuming. Recovery daemon does
not take part in database vacuuming any more.
The actual database recovery is handled via the recovery_helper and
recovery daemon should not need to attach to the databases any more.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
This is now implemented in the ctdb daemon using VACUMM_FETCH control.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
This occurs rarely but can adversely impact performance, so it is
worth logging it more frequently.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This currently skips the last record.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14147
RN: Avoid potential data loss during recovery after vacuuming error
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Currently some of this is supported by a periodic check in the
recovery daemon's main_loop(), which notices the flag change, sets
recovery mode active and freezes databases. If STOP_NODE returns
immediately then the associated recovery can complete and the node can
be continued before databases are actually frozen.
Instead, immediately do all of the things that make a node inactive.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14087
RN: Stop "ctdb stop" from completing before freezing databases
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Aug 20 08:32:27 UTC 2019 on sn-devel-184
There's no reason to avoid immediately setting recovery mode to active
and initiating freeze of databases.
This effectively reverts the following commits:
d8f3b490bbb4357a79d9
The latter is now implemented using a control, resulting in looser
coupling.
See also the following commit:
f8141e91a6
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14087
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is a superset of ctdb_local_node_got_banned() so will replace
that function, and will also be used in the NODE_STOP control.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14087
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is the core logic from ctdb_ip_to_pnn(), so re-implement that
that function using ctdb_ip_to_node().
Something similar (ctdb_ip_to_nodeid()) was recently removed in commit
010c1d77cd because it wasn't required.
Now there is a use case.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14084
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Compiling with -Wsign-compare complains:
1047 | && (call->call_id == CTDB_FETCH_WITH_HEADER_FUNC)) {
| ^~
struct ctdb_call is a protocol element, so we can't simply change it.
Found by csbuild.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Aug 14 10:29:59 UTC 2019 on sn-devel-184
Compiling with -Wsign-compare complains:
ctdb/server/ctdb_call.c:831:12: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
831 | if (count <= ctdb_db->statistics.hot_keys[0].count) {
| ^~
and
ctdb/server/ctdb_call.c:844:13: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
844 | if (count <= ctdb_db->statistics.hot_keys[i].count) {
| ^~
Found by cs-build.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
If the lock file is inaccessible or the inode number changes then the
lock is lost, so exit. This allows the recovery daemon to trigger an
election. The ensuing recovery will re-take the lock.
By default the lock file is checked every 60 seconds. A lot can
happen in 60 seconds but being more aggressive and accessing the lock
too often could result in a performance issue for the cluster
filesystem.
An new optional 2nd argument is added, which is the lock file re-check
time in seconds.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This will allow more conditions to be waited on via additional
sub-requests. At the moment this just completes when the parent wait
completes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Put the checking for the process being immediately re-parented into
the computation too. This will be very rare and doing it
consistently makes testing saner.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
There is no need to wait until the parent kills the helper. The
parent will get the initial response, indicating contention or
similar, and will then get a separate event indicating that the pipe
is gone.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This makes the code more explicit and makes testing easier due to less
dependencies.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
clang warns:
ctdb/server/ctdb_mutex_fcntl_helper.c:61:3: warning: Value stored to 'fd' is never read
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
One of these had a missing space, so this implicitly fixes it. It
also drops the need to unnecessarily include common.h, which comes
with some dependency baggage.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Only do this if the recovery lock is unset. Log every minute for the
first 10 minutes, then every 10 minutes, then every hour.
This is useful for determining whether a split brain occurred. It is
particularly useful if logging failed or was throttled at startup, so
there is no evidence of the split brain when it began.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This can never be NULL. It could probably be NULL in the past when
"all database" locks existed.
There are paths where is is checked for NULL and then later
dereferenced, causing static analysers to produce spurious warnings.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Indexing by PNN is wrong.
This also removes a signed/unsigned comparison because the PNN is not
compared to -1 anymore.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Node ID is a poorly defined concept, indicating the slot in the node
map where the IP address was found. This signed value also ends up
compared to num_nodes, which is unsigned, producing unwanted warnings.
Just return the PNN because this what both callers really want.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
... and does not just contain whitespace.
Otherwise NULL can be passed as the first argument to execv().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>