It is pointless having a recovery lock but not sanity checking that it
is working. Also, the logic that uses this tunable is confusing. In
some places the recovery lock is released unnecessarily because the
tunable isn't set.
Simplify the logic by assuming that if a recovery lock is specified
then it should be verified.
Update documentation that references this tunable.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
If the recovery lock file is unset then this dereferences a NULL
pointer. The regression is due to commit
6f1ac7af0f.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Print out the errno if the fcntl call fails.
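A minimal sketch of the idea, not CTDB's actual code: the helper name `try_lock_file()` and the message wording are assumptions made for illustration. It takes an exclusive fcntl() lock on a file and includes errno in the message when the call fails.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not a CTDB function): open path and take an
 * exclusive write lock on the whole file.
 * Returns the open fd on success, -1 on failure. */
static int try_lock_file(const char *path)
{
    struct flock lock;
    int fd;

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd == -1) {
        int saved_errno = errno;
        fprintf(stderr, "open(%s) failed: errno=%d (%s)\n",
                path, saved_errno, strerror(saved_errno));
        return -1;
    }

    memset(&lock, 0, sizeof(lock));
    lock.l_type = F_WRLCK;
    lock.l_whence = SEEK_SET;
    lock.l_start = 0;
    lock.l_len = 0; /* 0 means "until end of file" */

    if (fcntl(fd, F_SETLK, &lock) != 0) {
        /* Save errno first: fprintf() and close() may overwrite it */
        int saved_errno = errno;
        fprintf(stderr, "fcntl(F_SETLK) on %s failed: errno=%d (%s)\n",
                path, saved_errno, strerror(saved_errno));
        close(fd);
        return -1;
    }

    return fd;
}
```

Logging the numeric errno together with strerror() makes it possible to tell a lock held elsewhere (EACCES or EAGAIN) apart from a filesystem that does not support locking.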
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Richard Sharpe <rsharpe@samba.org>
Autobuild-User(master): Michael Adam <obnox@samba.org>
Autobuild-Date(master): Fri Jan 9 04:25:02 CET 2015 on sn-devel-104
Log a message when the reclock file actually changes and avoid a
memory allocation when it doesn't change.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
ctdb_sys_find_ifname() doesn't work for IPv6 addresses so don't use
it.
Trust the eventscript to do sanity checking on the interface. Current
warnings are replaced with equivalents generated by the eventscript.
The unlikely message:
  Public IP %s is hosted on interface %s but we have no VNN
will be replaced by:
  WARNING: Public IP %s hosted on interface %s but VNN says __none__
which is clear enough.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Processing one migration request at a time is very slow and processing
a batch of records can take longer than VacuumInterval. This causes
subsequent vacuum fetch requests to be dropped. The dropped records
can accumulate quickly and will cause the vacuum database traverse to
be quite expensive.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Dec 5 17:06:58 CET 2014 on sn-devel-104
Such records should be processed by the local vacuuming daemon to ensure
that all the remote copies have been deleted first.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
This prevents vacuuming from getting in the way of the ctdb daemon
processing record requests.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
This prevents vacuuming from getting in the way of the ctdb daemon
processing record requests.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
This prevents multiple child processes being forked at the same time
for vacuuming TDBs.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri Nov 14 03:06:12 CET 2014 on sn-devel-104
Some implementations may not understand RFC3164 format messages on the
UDP socket, so add support for RFC5424 message format.
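For illustration, the two wire formats can be compared with a small sketch. The function names and parameters below are assumptions and are not taken from CTDB's syslog backend; only the message layouts follow the two RFCs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* PRI is computed the same way in both RFCs: facility * 8 + severity */
static int syslog_pri(int facility, int severity)
{
    return facility * 8 + severity;
}

/* RFC 3164 (BSD) layout: <PRI>Mmm dd hh:mm:ss HOST TAG[PID]: MSG */
static int format_rfc3164(char *buf, size_t len, int pri,
                          const struct tm *tm, const char *host,
                          const char *app, int pid, const char *msg)
{
    char ts[32];

    /* %e pads a single-digit day with a space, as RFC 3164 requires */
    if (strftime(ts, sizeof(ts), "%b %e %H:%M:%S", tm) == 0) {
        return -1;
    }
    return snprintf(buf, len, "<%d>%s %s %s[%d]: %s",
                    pri, ts, host, app, pid, msg);
}

/* RFC 5424 layout:
 * <PRI>1 TIMESTAMP HOST APP-NAME PROCID MSGID STRUCTURED-DATA MSG
 * "-" is the NILVALUE, used here for MSGID and STRUCTURED-DATA. */
static int format_rfc5424(char *buf, size_t len, int pri,
                          const struct tm *tm, const char *host,
                          const char *app, int pid, const char *msg)
{
    char ts[32];

    /* tm is assumed to hold UTC, hence the trailing "Z" */
    if (strftime(ts, sizeof(ts), "%Y-%m-%dT%H:%M:%SZ", tm) == 0) {
        return -1;
    }
    return snprintf(buf, len, "<%d>1 %s %s %s %d - - %s",
                    pri, ts, host, app, pid, msg);
}
```

The RFC 5424 form carries an explicit version field, a full ISO 8601 timestamp with year, and separate APP-NAME and PROCID fields, which is what stricter receivers expect.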
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This has most of the advantages of the old logd with none of the
complexity of the extra process. There are several good syslog
implementations that can listen on the UDP port.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Remove --logfile and --syslog daemon options and replace with
--logging.
Modularise and clean up logging initialisation code. The
initialisation API includes an app_name argument that is currently
unused - this will be used in extensions to the syslog backend.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
It is much simpler for most cases to have a syslog backend that
doesn't need a separate CTDB-specific logging daemon. This loses the
lossy, non-blocking mode provided by logd. However, a corresponding
feature with a completely different implementation (not requiring an
extra daemon) will be re-added into the syslog backend. In an ideal
world the new implementation would be added first but unfortunately
that is hard to do because the logd code is hooked in at more than one
place.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This makes the code cleaner and allows the syslog backend to be easily
modified without affecting other code. Also do some extra clean-up,
including whitespace fixups.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is set but otherwise not used. This allows the 1st argument to
ctdb_set_logfile() to be generalised to a TALLOC_CTX.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is only used by logging code and there is already a file-level
variable for this. struct ctdb_context already contains too many
things.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Now it is obvious that it has something to do with child processes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Internally map them to DEBUG_ERR to limit code churn.
This reduces the unwieldy number of debug levels used by CTDB. ALERT
and CRIT aren't of much use as separate errors, since everything from
ERR up should always be logged. In future just ERR can be used.
This also improves compatibility with Samba's debug.c system priority
mapping.
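A rough sketch of how such a mapping might look. The numeric values and the `parse_debug_level()` helper are assumptions for illustration and may not match CTDB's headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed level values, for illustration only */
enum debug_level {
    DEBUG_ERR     = 0,
    DEBUG_WARNING = 1,
    DEBUG_NOTICE  = 2,
    DEBUG_INFO    = 3,
    DEBUG_DEBUG   = 4
};

/* Old names stay valid for existing callers but collapse to ERR */
#define DEBUG_ALERT DEBUG_ERR
#define DEBUG_CRIT  DEBUG_ERR

/* Hypothetical helper: map a level name to its numeric level.
 * Returns -1 for an unknown name. */
static int parse_debug_level(const char *name)
{
    static const struct {
        const char *name;
        int level;
    } map[] = {
        { "ALERT",   DEBUG_ALERT },
        { "CRIT",    DEBUG_CRIT },
        { "ERR",     DEBUG_ERR },
        { "WARNING", DEBUG_WARNING },
        { "NOTICE",  DEBUG_NOTICE },
        { "INFO",    DEBUG_INFO },
        { "DEBUG",   DEBUG_DEBUG },
    };
    size_t i;

    for (i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
        if (strcmp(name, map[i].name) == 0) {
            return map[i].level;
        }
    }
    return -1;
}
```

Because the old names are aliases rather than removed symbols, code that still says DEBUG_CRIT keeps compiling and simply logs at ERR.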
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
It isn't used and shouldn't be. CTDB can't make the system unusable.
Update associated test to ensure that EMERG isn't attempted. Actually
test all remaining debug levels and modernise the test a bit.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This got lost with the transition to the new Samba debug code.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This avoids a clash with Samba's BINDIR and also makes it easier to
move the helpers to somewhere else (e.g. libexec) in the future.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
Samba's debug subsystem has changed a lot, so CTDB's logging needs
to be rewritten to be compatible.
The new debug.h/debug.c can't just be pulled in because it has some
extra dependencies into Samba's lib/util. For now, to support the
smallest possible patch, implement a minimal subset of Samba's
debug.[ch] that just supports the DEBUG_CALLBACK logtype.
Define a callback for each logging method.
Check later to see if debug_extra (or similar) can somehow be
implemented using debug classes.
The timestamp on CTDB CLI tool and test program DEBUG() output goes
away, so update the unit test code to cope.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Volker Lendecke <vl@samba.org>
As far as we know, nobody uses this and it just complicates the
logging subsystem.
Remove all ringbuffer code and documentation. Update the local
daemons startup code correspondingly.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Volker Lendecke <vl@samba.org>
When ctdb daemon starts up, it considers itself the recovery master
and tries to do first recovery. However, it's possible that there is
already a recovery master and the current node has not yet heard from it.
So do not ban ourselves immediately if ctdb_recovery_lock() fails when
doing first recovery.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
When timer expires, timeout handler routine sets lock_ctx->ttimer
to a newly created timer event. However, when a node is INACTIVE,
timeout handler returns early with lock_ctx->ttimer set to the previous
timer event. This timer event gets freed when the callback returns and
lock_ctx->ttimer remains set to already freed timer event.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Deferred calls should not be treated as pending calls since they are
re-processed from the beginning.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Otherwise errors printed by the lock helper get lost.
lock_helper_args() no longer adds the program name to the list of
arguments, since vfork_with_logging() does that. Update the lock
helper to handle the extra log_fd parameter passed by
vfork_with_logging() and send stdout/stderr there.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
To make this sane, also add an argv parameter and change the return
type to bool. Anticipating a subsequent change, make the type of argv
match what is needed by vfork_with_logging() and cast it when passing
to execv(). This also means changing the type of the name member of
struct db_namelist.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
To avoid lock helper starvation when userspace robust mutexes are
enabled.
Commit 6f072f85a1 removed reset_scheduler(),
to avoid resetting scheduler priority. However, that is not sufficient
because of commit 1be8564e55, which sets
SCHED_RESET_ON_FORK flag. With SCHED_RESET_ON_FORK, all CTDB child
processes will automatically have normal scheduling priority.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Sep 11 11:31:10 CEST 2014 on sn-devel-104
This makes it consistent with Samba, to ease transition.
Update unit test code to link with tdb_wrap instead of including
db_wrap.c.
There are some potential whitespace fixes in this commit that have
been ignored. CTDB's lib/tdb_wrap will be deleted after the
transition to Samba's lib/tdb_wrap, so there's no point polishing it
too much.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This makes it consistent with the rest of the code and avoids problems
when some variant of lib/util isn't in the include path.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Samba's version doesn't accept an argument, so this aids a smooth
transition.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is part of a migration to Samba's lib/util. CTDB always passes 0
(i.e. no max_size) so use a simple assert() to enforce this, rather
than changing a lot of code that will be discarded anyway.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is the only place it is used.
After migrating to Samba's lib/util, the lock helper can be changed to
use strhex_to_data_blob().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
To avoid warnings when using --enable-developer, which uses
-Wmissing-prototypes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
To avoid warnings when using --enable-developer, which uses
-Wmissing-prototypes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Deferring packets has a nasty interaction with recovery. All deferred
packets must be dropped when recovery happens, since those packets are
tracked as pending requests and will be re-sent with new generation.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri Sep 5 09:30:50 CEST 2014 on sn-devel-104
When using TDB robust mutexes, the kernel wakes waiting processes one
by one, in the priority list order. To ensure that ctdb lock helper
processes do not starve, lock helper processes need to run at a higher
priority than smbd.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
When CTDB receives DMASTER_REQUEST or DMASTER_REPLY packet, the specified
record needs to be updated as soon as possible to avoid inconsistent
dmaster information between nodes. During this time, queue up all calls
for that record and process them only after dmaster request/reply has
been processed.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
There is no need for a special function to free lock request and
corresponding lock context. Freeing lock request will free lock
context also.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
This makes sure that when the client context is destroyed, the lock
request goes away. If the lock request is already scheduled, then the
lock child process will be terminated.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Seeing these with -Wall:
  ../server/ctdb_call.c:1117:3: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
     record_flags = *(uint32_t *)&c->data[c->keylen + c->datalen];
     ^
memcpy() seems to be the easiest way to fix these. The
alternative would be to use unmarshalling functions.
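A minimal sketch of the memcpy() approach. The helper name `read_u32()` is an assumption and this is not the actual patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a uint32_t stored at an arbitrary byte
 * offset.  Copying through memcpy() avoids both the strict-aliasing
 * violation and any alignment assumption that a cast such as
 * *(uint32_t *)&buf[offset] would make. */
static uint32_t read_u32(const uint8_t *buf, size_t offset)
{
    uint32_t val;

    memcpy(&val, buf + offset, sizeof(val));
    return val;
}
```

Compilers typically turn a fixed-size memcpy() like this into a single load, so the safer form usually costs nothing at run time.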
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Revoking readonly record involves first marking the record on dmaster as
RO_REVOKING_READONLY. Then all the other nodes are sent update_record
control to get rid of RO_DELEGATION. Once that succeeds, the record
is marked RO_REVOKING_COMPLETE.
Currently, revoking of readonly delegations on the nodes is tried only
once. If a node goes in recovery, it can fail update_record control and
revoke code will abort ctdb. Since database recovery would revoke all
readonly delegations anyway, there is no reason to abort. Simply undo
the start of revoke process by resetting RO_REVOKING_READONLY flag.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Wed Aug 13 11:24:09 CEST 2014 on sn-devel-104
This patch makes the subsequent logic change small and easier to
understand.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
I like early returns that avoid else branches :-)
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Aug 6 14:44:31 CEST 2014 on sn-devel-104
This avoids traversing a single pending queue which is quite expensive
when there are lots of pending lock requests. This seems to happen
quite a lot on a loaded cluster for notify_index.tdb.
Adding per database queues avoids the need to traverse pending queue
for that database if there are already the maximum number of active
lock requests.
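The per-database idea can be sketched as follows. All structure and function names here are hypothetical and much simpler than CTDB's lock contexts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pending lock request */
struct lock_req {
    int id;
    struct lock_req *next;
};

/* Hypothetical per-database state: a FIFO of pending requests and a
 * count of requests that are currently active */
struct db_queue {
    struct lock_req *head;
    struct lock_req *tail;
    int num_active;
};

/* Append a request to the tail of a database's pending queue */
static void db_queue_add(struct db_queue *q, struct lock_req *r)
{
    r->next = NULL;
    if (q->tail != NULL) {
        q->tail->next = r;
    } else {
        q->head = r;
    }
    q->tail = r;
}

/* Pop the next request for one database, or return NULL.  A database
 * already at its limit is skipped without looking at its queue. */
static struct lock_req *db_queue_schedule(struct db_queue *q, int max_active)
{
    struct lock_req *r;

    if (q->num_active >= max_active) {
        return NULL;
    }
    r = q->head;
    if (r == NULL) {
        return NULL;
    }
    q->head = r->next;
    if (q->head == NULL) {
        q->tail = NULL;
    }
    r->next = NULL;
    q->num_active++;
    return r;
}

/* Find the first schedulable request across all databases */
static struct lock_req *schedule_any(struct db_queue *dbs, size_t ndbs,
                                     int max_active)
{
    size_t i;

    for (i = 0; i < ndbs; i++) {
        struct lock_req *r = db_queue_schedule(&dbs[i], max_active);
        if (r != NULL) {
            return r;
        }
    }
    return NULL;
}
```

With a single global queue, a busy database such as notify_index.tdb would force a scan past all of its pending entries; here the check is one integer comparison per database.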
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Mon Aug 4 20:23:45 CEST 2014 on sn-devel-104
This allows DB locks to be scheduled quickly without having to scan
through the pending lock requests.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
The number of pending locks displayed in ctdb statistics is stored in
ctdb_statistics structure and not ctdb_context.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
This allows the maximum number of lock processes that can be active
to be changed.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
This avoids extra work in case lock request allocation fails.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
This prevents searching through active lock requests for every pending
lock request to check if the pending lock request can be scheduled or not.
The locks are scheduled in strict first-in-first-out order.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Store only a single request instead of storing a queue in lock context.
Lock request structure does not need to be a linked list any more.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
This was a bad idea and caused out of order scheduling of lock requests.
The logic to append lock requests to existing lock context is already
commented out. Remove the commented-out code. There is no need to check
if lock_ctx is NULL, since we are always creating a new one.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
block_child was used to keep track of a process which was created to debug
why a lock process has blocked. That logic was replaced to execute an
external debug script.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
This indirect caller of delete_marshall_traverse was missed
in fa4a81c86b
which lets failure of the second traverse fail the vacuum run.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This avoids duplicate code and extra talloc in ctdb_marshall_record.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Sometimes the recovery daemon fails to get the recovery lock on one
node so that node is banned. This seems to always happen during an
election. The recovery is triggered because other nodes are found to
have recovery mode enabled. They have recovery mode enabled because
an election has been forced.
The recovery daemon's main_loop() only does an initial check for an
election. After that, a node can force an election and, in the
process, set itself to be the current winner. In this situation,
verify_recmode() will always return MONITOR_RECOVERY_NEEDED so
do_recovery() is called. If the previous recovery master hasn't
admitted defeat and released the recovery lock, then do_recovery()
will rightly fail. However, it would be better if it failed a little
more gracefully, since this case is not that unusual.
Instead of trying to take the recovery lock, return early with an
error if there is an election in progress. Note that the race is
still there but it is now much narrower.
There are probably more subtle ways of avoiding this issue, including
something like this in main_loop():
  -     if (pnn != rec->recmaster) {
  +     if (pnn != rec->recmaster || rec->election_timeout) {
                return;
        }
However, this check is done earlier so it leaves the race window open
a little wider.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Jul 21 06:57:07 CEST 2014 on sn-devel-104
To enable TDB mutex support, set tunable TDBMutexEnabled=1.
When databases are attached for the first time, attach flags must include
TDB_MUTEX_LOCKING and TDBMutexEnabled must be set to enable mutex support.
However, when CTDB attaches databases internally for recovery, it will
enable mutex support if TDBMutexEnabled is set.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Jul 9 06:45:17 CEST 2014 on sn-devel-104
Runtime check for robust mutexes is performed just before opening local tdb.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
This prevents ctdb tool from thawing databases prematurely in
thaw/wipedb/restoredb commands if recovery is active.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Setting recovery mode to active is the only correct way to inform recovery
daemon to run database recovery. Only freezing databases without setting
recovery mode should not trigger database recovery, as this mechanism
is used in the tool to implement wipedb/restoredb commands.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
This reverts commit 6578a97bd9.
This condition cannot happen since when recovery is triggered, all the
databases would get frozen and thawed in the order of priority. The only
other place where databases get frozen is the implementation of the ctdb
wipedb/restoredb commands.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
That makes people think there's a problem (and report bugs) so say
something a bit less scary instead...
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
It is a non-trivial event and will make it easier to debug recovery
lock issues.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is unnecessary since ctdbd_pid is set very early in the code before
creating any other processes including recovery daemon.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Sat Jul 5 09:20:27 CEST 2014 on sn-devel-104
This duplicates ctdb->ctdbd_pid.
Thanks to Sumit Bose <sbose@redhat.com> for the suggestion.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
If something unexpectedly uses fork() then an exiting child will
remove the PID file while the main daemon is still running. The real
test is whether the current process has the PID of the main CTDB
daemon, which is the process that calls setsid().
This could be done using getpgrp() instead. At the moment the
eventscript handler harmlessly calls setpgid() - harmless because the
atexit() handlers are cleared upon exec(). However, it is possible
that process groups will be used more in future so it is probably
better to rely on the session ID.
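A minimal sketch of the session ID test described above. The function names are assumptions for illustration, not CTDB's.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* True only in the process that called setsid(): for a session leader
 * the PID equals the session ID */
static bool is_session_leader(void)
{
    return getpid() == getsid(0);
}

/* Hypothetical cleanup step: remove the PID file only when running in
 * the main daemon.  Returns true if the file was removed. */
static bool remove_pidfile_if_daemon(const char *path)
{
    if (path == NULL || !is_session_leader()) {
        return false;
    }
    return unlink(path) == 0;
}

/* Fork a child and report whether it regards itself as the session
 * leader.  Returns 1 or 0, or -1 on error. */
static int forked_child_is_leader(void)
{
    int status;
    pid_t pid = fork();

    if (pid == -1) {
        return -1;
    }
    if (pid == 0) {
        /* Child: new PID, inherited session ID */
        _exit(is_session_leader() ? 1 : 0);
    }
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

A forked child inherits the session ID but gets a new PID, so the check fails in the child and its exit path would leave the PID file alone.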
Thanks to Sumit Bose <sbose@redhat.com> for the idea.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Currently ctdbd_wrapper depends on the session ID. Very soon PID file
removal will too. :-)
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This was useful for debugging the race fixed by commit
4f79fa6c7c. It might be useful again.
Also fix a nearby comment typo.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Jun 20 02:07:48 CEST 2014 on sn-devel-104
and not only if repack_limit != 0. This partially reverts
commit 48f2d11588.
With the new tdb code this defragments the
free list by merging adjacent records.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This got lost in commit 1994870299
("ctdb-vacuum: make ctdb_vacuum_traverse_db() void.")
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is a small code cleanup.
vdata is only used in ctdb_vacuum_db() and not in
ctdb_vacuum_and_repack_db() where it is currently initialized.
This patch moves creation and all previously scattered
initialization of vacuum_data into ctdb_vacuum_init_vacuum_data
which is called from ctdb_vacuum_db.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Since we usually have 0 records left for repack-deletion and
repacking is essentially used for the purpose of defragmenting
the freelist, we can use the vanilla tdb_repack function.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The repack operation now mainly defragments the freelist
and does not usually delete any records any more.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Now we usually have records to delete == 0 after the preceding
vacuum run. Anyways, deletion is not a major aspect any more
of the repack run and will vanish soon.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Do not run helper processes with real-time priority.
This regression was caused when locking and eventscript code switched
to use vfork() and helper instead of ctdb_fork().
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Jun 12 08:10:36 CEST 2014 on sn-devel-104
This function does not block signals, but ignores them.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Once CTDB is daemonized, it starts ignoring SIGPIPE anyway.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu Jun 5 19:51:36 CEST 2014 on sn-devel-104
It might as well be near where it is used. Add a comment explaining
it.
Also add/update comments at the top of the RELEASE_IP and TAKEOVER_IP
loops to explain what is happening.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon May 5 06:20:39 CEST 2014 on sn-devel-104
As part of vacuuming, recoverd attaches to databases to migrate records.
When detaching a database from main daemon, it should be removed from
recovery daemon also.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
Autobuild-User(master): Michael Adam <obnox@samba.org>
Autobuild-Date(master): Wed Apr 23 17:05:45 CEST 2014 on sn-devel-104
This will ensure that when ctdb_db is freed, it will close the tdb
database.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
This avoids the server detaching a database if clients are allowed to
connect to databases.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
Database priority is a global property and all the nodes should have the
priority set for the databases. Just setting priority on one node can
lead to problems in the recovery as a database can be frozen at wrong
priority and then freezing database would not succeed.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: David Disseldorp <ddiss@samba.org>
Autobuild-User(master): David Disseldorp <ddiss@samba.org>
Autobuild-Date(master): Mon Apr 7 14:06:26 CEST 2014 on sn-devel-104
Signed-off-by: Gregor Beck <gbeck@sernet.de>
Reviewed-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
Autobuild-User(master): Michael Adam <obnox@samba.org>
Autobuild-Date(master): Tue Apr 1 02:59:05 CEST 2014 on sn-devel-104
In case of control timeouts, readonly revoke code currently aborts. This
needs to be fixed. Meanwhile, using control_timeout instead of 5 seconds
increases the timeout to 60 seconds.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Mar 31 07:20:48 CEST 2014 on sn-devel-104
This replaces memory comparison of the key with integer comparison.
In addition, this also avoids scheduling locks with the same hash.
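The comparison can be sketched as below. The structure layout and the FNV-1a hash are assumptions chosen for a self-contained example; CTDB uses its own hash function and lock context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lock context: the key hash is computed once, when the
 * request is created */
struct lock_ctx {
    uint32_t db_id;
    uint32_t key_hash;
};

/* 32-bit FNV-1a, used here only as a stand-in hash */
static uint32_t key_hash(const uint8_t *key, size_t len)
{
    uint32_t h = 2166136261u;
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= key[i];
        h *= 16777619u;
    }
    return h;
}

static struct lock_ctx make_lock_ctx(uint32_t db_id, const uint8_t *key,
                                     size_t len)
{
    struct lock_ctx ctx;

    ctx.db_id = db_id;
    ctx.key_hash = key_hash(key, len);
    return ctx;
}

/* Two requests conflict if they target the same database and their
 * key hashes match: an integer comparison instead of memcmp() on the
 * key.  Distinct keys that share a hash are also treated as a
 * conflict, which only delays scheduling. */
static bool lock_conflicts(const struct lock_ctx *a, const struct lock_ctx *b)
{
    return a->db_id == b->db_id && a->key_hash == b->key_hash;
}
```

Treating a hash collision as a conflict is conservative: the second request waits a little longer but correctness is unaffected.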
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri Mar 28 05:28:58 CET 2014 on sn-devel-104
If lock_request could not be allocated, free lock_ctx since there can
only be a single lock_request per lock_ctx.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Previous commits maintained the ordering between
ctdb_remove_orphaned_ifaces() and ctdb_vnn_unassign_iface(). This
meant that ctdb_remove_orphaned_ifaces() needed to steal the orphaned
interfaces and they would be freed later.
Unassign the interface first and things get simpler.
ctdb_remove_orphaned_ifaces() is now self-contained.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Sun Mar 23 06:20:43 CET 2014 on sn-devel-104
reloadips really expects deleted IPs to be released before completing.
Otherwise the recovery daemon starts failing the local IP check. The
races that follow can cause a node to be banned.
To make the error handling simple, do the actual deletion in
release_ip_callback().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is racy and cbffbb7c2f makes it
unnecessary.
The eventscript code still knows that monitor events are special
compared to other events. However, the general concept of monitoring
is no longer tangled up with running scripts.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Commit 0723fedced added a cheap
implementation of ctdb_control_startup() that simply flags the recipient
node as needing to send updates for each IP when the tickle update
loop next fires. Commit 026996550d
ensures that a node only sends tickle updates once being flagged to do
so.
CTDB_CONTROL_STARTUP is broadcast to all nodes, so this is a good
start. However, the tickle updates are only broadcast to connected
nodes. A recently started node may not yet be considered to be
connected because the keepalive monitoring loop may not yet have
marked the node as connected. This means that the tickle update loop
races with the keepalive monitoring loop. If the tickle update loop
wins then updates will not be sent to the recently started node.
The simplest improvement is to stop the tickle update from depending
on whether a node is connected or not. So instead of broadcasting
tickle updates to connected nodes, they are broadcast to all nodes.
Since no reply is expected, this should work just fine.
While looking at this code, ctdb_ctrl_set_tcp_tickles() is named like
a client function. It isn't a client function. Also, 2 of the
arguments are ignored. So rename this function to
ctdb_send_set_tcp_tickles_for_ip() and remove the ignored arguments.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
When bumping skipped, decrement left so that the sum is correct.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu Mar 6 03:32:33 CET 2014 on sn-devel-104
We need to have left records == 0 at the end of the delete list processing.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Failure in traversal of the DB should not
prevent further processing.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
We should try to continue vacuuming as much as possible.
Failure to send records to one lmaster doesn't mean the
others will fail too.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>