IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
The path of the TDB is known, so calculate the file ID (device number
+ inode number) from it and use this to directly filter /proc/locks to
find processes holding locks.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Don't use the arguments yet. They will be used in a simplified
version of the code.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
1. PID of lock helper waiting for lock
2. Scope of lock: "record" or "db"
3. Path to database that lock helper is trying to lock
4. Whether the database uses mutexes: "mutex" or "fcntl"
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
These were fine (though still lazy) when these tests were the only
user of this stub. However, the ps stub is about to be enhanced, so
fix these uses of it to represent the intended usage.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The main reason for this is to facilitate testing.
Avoid some /proc accesses entirely by using ps(1) (which can be
replaced by a stub when testing) because this script might as well be
more portable in case anyone wants to add lock debugging for a
non-Linux platform. While the "state" format specification isn't
POSIX-compliant, it works on both Linux and FreeBSD so it is a
reasonable improvement.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
If a script times out the caller can talloc_free() the script_list
output of run_event_recv, which talloc_free's proc->output from
run_proc.c as well. If the script generates further output after the
timeout and then exits after a while, the SIGCHLD handler in the
eventd tries to read into proc->output, which was already free'ed.
Fix this by not doing just a talloc_steal but a talloc_move. This way
proc_read_handler() called from run_proc_signal_handler() does not try
to realloc the stale reference to proc->output but gets a NULL
reference.
I don't really know how to do a knownfail in ctdb, so this commit
actually activates catching the signal by waiting long enough for
22.bar to exit and generate the SIGCHLD.
Bug: https://bugzilla.samba.org/show_bug.cgi?id=14475
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Ralph Boehme <slow@samba.org>
This will lead to a crash in run_event_test.c soon
Bug: https://bugzilla.samba.org/show_bug.cgi?id=14475
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Ralph Boehme <slow@samba.org>
Triggers a different code path in run_event_* and aligns it more what
the ctdb eventd really does.
Bug: https://bugzilla.samba.org/show_bug.cgi?id=14475
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Ralph Boehme <slow@samba.org>
Use F_GETLK to get the lock holder PID, this is more accurate than
reading the file contents: A conflicting process might not have
written its PID yet. Also, F_GETLK easily allows to do a retry if the
lock holder just died.
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
This test has been failing with:
Wait until record is migrated to lmaster node 0
<30|BAD: node 0 is not dmaster
dmaster: 1
rsn: 8
flags: 0x00010000 MIGRATED_WITH_DATA
data(6) = "value1"
*** TEST COMPLETED (RC=1) AT 2021-02-02 06:18:48, CLEANING UP...
This should never happen. If this really fails then the wait should
time out.
The problem is that wait_until() does:
"$@" || _rc=$?
and vacuum_test_key_dmaster() currently calls ctdb_test_fail() on
failure, which causes the shell to exit. Instead, pass a variant to
wait_until() that simply returns the correct status instead of
exiting.
An alternative would be to change the statement in wait_until() to do:
("$@") || _rc=$?
so it captures the exit. However, this is a global change and
requires more thought.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Jeremy Allison <jra@samba.org>
This is helpful if you are in a listening loop with the same receiver
for many sockets doing the same thing.
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
If run with UID wrapper and UID_WRAPPER_ROOT=1 then securing the
socket will fail.
Test mode means that local daemons are in use, so securing the socket
is not important.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Variable res is only used once and ret is re-used many times. Drop
res, use ret, which doesn't need to be initialised. Modernise debug
macro.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
When compiling with GCC 10.x and -O3 optimization, the IP checksum
calculation code generates wrong checksum. The function uint16_checksum
gets inlined during optimization and ip4pkt->tcp data gets wrongly
aliased.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14537
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Wed Oct 21 05:52:28 UTC 2020 on sn-devel-184
Check that the desired state is set on all nodes instead of just the
test node. This ensures that node flags have correctly propagated
across the cluster.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14513
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Oct 6 04:32:06 UTC 2020 on sn-devel-184
update_flags() has already updated the recovery master's canonical
node map, based on the flags from each remote node, and pushed out
these flags to all nodes.
If i == j then the node map has already been updated from this remote
node's flags, so simply drop this case.
Although update_flags() has updated flags for all nodes, it did not
update each node map in remote_nodemaps[] to reflect this. This means
that remote_nodemaps[] may contain inconsistent flags for some nodes
so it should not be used to check consistency when i != j.
Further, a meaningful difference in flags can only really occur if
update_flags() failed. In that case this code is never reached.
These observations combine to imply that this whole loop should be
dropped.
This leaves potential sub-second inconsistencies due to out-of-band
healthy/unhealthy flag changes pushed via CTDB_SRVID_PUSH_NODE_FLAGS.
These updates could be dropped (takeover run asks each node for
available IPs rather than making centralised decisions based on node
flags) but for now they will be fixed in the next iteration of
main_loop().
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14513
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This has already been done in update_flags().
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14513
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Samuel Cabrero <scabrero@samba.org>
Autobuild-User(master): David Disseldorp <ddiss@samba.org>
Autobuild-Date(master): Thu Sep 24 00:52:42 UTC 2020 on sn-devel-184
The Ceph Manager's service map is useful for tracking the status of
Ceph related services. By registering the CTDB recovery lock holder,
Ceph storage administrators can more easily identify where and when a
CTDB cluster is up and running.
Signed-off-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Samuel Cabrero <scabrero@samba.org>
This is no longer necessary because the capability new style database
pull is assumed to always be available.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Removes use of the old controls without cleaning up the code. Clean
up can be done later.
After this change the CTDB_CAP_FRAGMENTED_CONTROLS capability is no
longer checked. This capability can be removed along with the
controls.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The new waf detects a duplicate instance of
ctdb_mutex_ceph_rados_helper.7.xml, which is due
to manpages_extra being a pointer to
manpages_misc, therefore each call to build()
added duplicate entries to the manpages_misc
global entry.
Signed-off-by: David Mulder <dmulder@suse.com>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
This makes it consistent with the monitoring code. If the master has
changed then this means the master will always get the message.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Aug 18 06:24:11 UTC 2020 on sn-devel-184
This also updates remote flags so the name is misleading.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
update_local_flags() will be changed to use these nodemaps.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The nodemap has already been fetched from the local node and is
actually passed to this function. Care must be taken to avoid
referencing the "remote" nodemap for the recovery master. It also
isn't useful to do so, since it would be the same nodemap.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The plan here is to use the nodemaps retrieved by get_remote_nodes()
in update_local_flags(). This will improve efficiency, since
get_remote_nodes() fetches flags from nodes in parallel. It also
means that get_remote_nodes() can be used exactly once early on in
main_loop() to retrieve remote nodemaps. Retrieving nodemaps multiple
times is unnecessary and racy - a single monitoring iteration should
not fetch flags multiple times and compare them.
This introduces a temporary behaviour change but it will be of no
consequence when the above changes are made.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>