IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
pnn and nodemap are both available via the rec context, so simplify.
vnnmap is unused.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The pnn and nodemap arguments to force_election() and
send_election_request() are always effectively rec->pnn and
rec->nodemap, so simplify.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is currently referenced in a number of inconsistent
ways, including:
* pnn
* rec->ctdb->pnn
* ctdb->pnn
* ctdb_get_pnn(ctdb)
* ctdb_get_pnn(rec->ctdb)
The first of these always requires some thought about the context - is
this the node PNN or some other PNN (e.g. argument to function)?
The intention is to always use rec->pnn when referring to the recovery
daemon's PNN.
Doing this also reduces reliance on struct ctdb_context internals.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Make the code self-documenting.
This preempts an upcoming change to terminology but doing it now saves
a lot of churn.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The recovery and takeover helpers can run for a while and generate
non-trivial logs, so have them reopen their logs to support log
rotation.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Jan 17 04:36:30 UTC 2022 on sn-devel-184
CTDB_SRVID_SET_NODE_FLAGS is no longer sent so drop monitor_handler()
and replace with srvid_not_implemented(). Mark the SRVID obsolete in
its comment.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This will usually happen if flags on the node in question change, so
keeping the code simple and pushing to all nodes won't hurt. When all
nodes come up there might be differences in connected nodes, causing
such "fix ups". Receiving nodes will ignore no-op pushes.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The resulting code structure looks a little weird. However, there is
another condition that requires the flags to be pushed that will be
inserted before the continue statement in a subsequent commit..
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
update_flags() has already updated the recovery master's canonical
node map, based on the flags from each remote node, and pushed out
these flags to all nodes.
If i == j then the node map has already been updated from this remote
node's flags, so simply drop this case.
Although update_flags() has updated flags for all nodes, it did not
update each node map in remote_nodemaps[] to reflect this. This means
that remote_nodemaps[] may contain inconsistent flags for some nodes
so it should not be used to check consistency when i != j.
Further, a meaningful difference in flags can only really occur if
update_flags() failed. In that case this code is never reached.
These observations combine to imply that this whole loop should be
dropped.
This leaves potential sub-second inconsistencies due to out-of-band
healthy/unhealthy flag changes pushed via CTDB_SRVID_PUSH_NODE_FLAGS.
These updates could be dropped (takeover run asks each node for
available IPs rather than making centralised decisions based on node
flags) but for now they will be fixed in the next iteration of
main_loop().
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14513
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This has already been done in update_flags().
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14513
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This makes it consistent with the monitoring code. If the master has
changed then this means the master will always get the message.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Aug 18 06:24:11 UTC 2020 on sn-devel-184
This also updates remote flags so the name is misleading.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
update_local_flags() will be changed to use these nodemaps.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The nodemap has already been fetched from the local node and is
actually passed to this function. Care must be taken to avoid
referencing the "remote" nodemap for the recovery master. It also
isn't useful to do so, since it would be the same nodemap.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The plan here is to use the nodemaps retrieved by get_remote_nodes()
in update_local_flags(). This will improve efficiency, since
get_remote_nodes() fetches flags from nodes in parallel. It also
means that get_remote_nodes() can be used exactly once early on in
main_loop() to retrieve remote nodemaps. Retrieving nodemaps multiple
times is unnecessary and racy - a single monitoring iteration should
not fetch flags multiple times and compare them.
This introduces a temporary behaviour change but it will be of no
consequence when the above changes are made.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This array is indexed by the same index as nodemap, not the PNN.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Also drop error handling in main_loop() that is replaced by this
change.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This will allow an error callback to be added.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Change 1st argument to a rec context, since this will be needed later.
Drop the nodemap argument and access it via rec->nodemap instead.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The memory is allocated off the memory context used by the current
iteration of main loop. It is freed when main loop completes the fix
doesn't require backporting to stable branches. However, it is sloppy
so it is worth fixing.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Don't log an error on failure - let the caller can do this. Apart
from this: fix up coding style and modernise the remaining error
message.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The logic currently in ctdb_ctrl_modflags() will be optimised so that
it no longer matches the pattern for a control function. So, remove
this function and squash its functionality into the only caller.
Although there are some superficial changes, the behaviour is
unchanged.
Flattening the 2 functions produces some seriously weird logic for
setting the new flags, to the point where using ctdb_ctrl_modflags()
for this purpose now looks very strange. The weirdness will be
cleaned up in a subsequent commit.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is clearer than using the MODFLAGS control directly.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This makes fields such as recmaster and nodemap easily available if
required.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
An unused argument needlessly extends the length of function calls. A
subsequent change will allow rec->nodemap to be used if necessary.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Inside the nested event loop in ctdb_ctrl_getnodemap(), various
asynchronous handlers may dereference rec->nodemap, which will be
NULL.
One example is lost_reclock_handler(), which causes rec->nodemap to be
unconditionally dereferenced in list_of_nodes() via this call chain:
list_of_nodes()
list_of_active_nodes()
set_recovery_mode()
force_election()
lost_reclock_handler()
Instead of attempting to trace all of the cases, just avoid leaving
rec->nodemap set to NULL. Attempting to use an old value is generally
harmless, especially since it will be the same as the new value in
most cases.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14324
Reported-by: Volker Lendecke <vl@samba.org>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Tue Mar 24 01:22:45 UTC 2020 on sn-devel-184
The only reason for recoverd attaching to databases was to migrate
records to the local node as part of vacuuming. Recovery daemon does
not take part in database vacuuming any more.
The actual database recovery is handled via the recovery_helper and
recovery daemon should not need to attach to the databases any more.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
This is now implemented in the ctdb daemon using VACUMM_FETCH control.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Only do this if the recovery lock is unset. Log every minute for the
first 10 minutes, then every 10 minutes, then every hour.
This is useful for determining whether a split brain occurred. It is
particularly useful if logging failed or was throttled at startup, so
there is no evidence of the split brain when it began.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Simple cases where variables need to be declared as an unsigned type
instead of an int.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>