IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
==20761== Invalid read of size 8
==20761== at 0x11BE30: ctdb_ctrl_dbstatistics (ctdb_client.c:1286)
==20761== by 0x12BA89: control_dbstatistics (ctdb.c:713)
==20761== by 0x1312E0: main (ctdb.c:6543)
==20761== Address 0x713b0d0 is 0 bytes after a block of size 560 alloc'd
==20761== at 0x4C27A2E: malloc (vg_replace_malloc.c:270)
==20761== by 0x5CB0954: _talloc_memdup (talloc.c:615)
==20761== by 0x11395C: ctdb_control_recv (ctdb_client.c:1146)
==20761== by 0x11BDD7: ctdb_ctrl_dbstatistics (ctdb_client.c:1265)
==20761== by 0x12BA89: control_dbstatistics (ctdb.c:713)
==20761== by 0x1312E0: main (ctdb.c:6543)
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
It is important to know when ctdbd is started with --start-as-stopped
or --start-as-disabled. Given that this only happens once it makes
sense to promote these debug items to NOTICE level.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
If a node was previously recovery master (say, 20 years ago) and it
becomes recovery master again then, if IP assignments have changed,
verify_remote_ip_allocation() can produce messages like the following
when called during recovery:
ctdbd: recoverd:Inconsistent IP allocation - node 0 thinks 10.1.1.1 is held by node 0 while it is assigned to node 1
When a node loses an election it should clear all data specific to it
being the recovery master.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This needs to be cleared to avoid stale data when a new recovery
master is elected.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This was used to unlink the socket at daemon exit, which
was removed in ctdb commit b18356764cd49d934eab901e596bb75c6e3ecdf8
(Samba master commit 4259156050).
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Volker Lendecke <vl@samba.org>
Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Thu Jun 25 18:29:59 CEST 2015 on sn-devel-104
This reverts commit 257311e337.
That commit was due to a misunderstanding, and it
does not fix what it was supposed to fix.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The caller should not dereference lock_ctx after invoking
process_callbacks(), it might be destroyed already.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Jun 12 15:28:57 CEST 2015 on sn-devel-104
We should not dereference lock_ctx after invoking the callback
in the auto_mark == false case. The callback could have destroyed it.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Let ctdb_lock_request_destructor() take care of the proper cleanup.
If the request if freed from the callback function, then the lock context
should not be freed. Setting request->lctx to NULL takes care of that
in the destructor.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
There is already code in the destructor to correctly remove it from the
pending or the active queue. This also ensures that when lock context
is in pending queue and if the lock request gets freed, the lock context
is correctly removed from the pending queue.
Thanks to Stefan Metzmacher for noticing this and suggesting the fix.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
The code was added to ctdb_lock_context_destructor() to ensure that
the if a lock_ctx gets freed first, the lock_request does not have a
dangling pointer. However, the reverse is also true. When a lock_request
is freed, then lock_ctx should not have a dangling pointer.
In commit 374cbc7b0f, the code for second
condition was dropped causing a regression.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
If the lock request is freed from within the callback, then setting
lock_ctx->request to NULL in ctdb_lock_context_destructor will end up
corrupting memory. In this case, lock_ctx->request could be reallocated
and pointing to something else. This may cause unexpected abort trying
to dereference a NULL pointer.
So, set lock_ctx->request to NULL before processing callbacks.
This avoids the following valgrind problem.
==3636== Invalid write of size 8
==3636== at 0x151F3D: ctdb_lock_context_destructor (ctdb_lock.c:276)
==3636== by 0x58B3618: _talloc_free_internal (talloc.c:993)
==3636== by 0x58AD692: _talloc_free_children_internal (talloc.c:1472)
==3636== by 0x58AD692: _talloc_free_internal (talloc.c:1019)
==3636== by 0x58AD692: _talloc_free (talloc.c:1594)
==3636== by 0x15292E: ctdb_lock_handler (ctdb_lock.c:471)
==3636== by 0x56A535A: epoll_event_loop (tevent_epoll.c:728)
==3636== by 0x56A535A: epoll_event_loop_once (tevent_epoll.c:926)
==3636== by 0x56A3826: std_event_loop_once (tevent_standard.c:114)
==3636== by 0x569FFFC: _tevent_loop_once (tevent.c:533)
==3636== by 0x56A019A: tevent_common_loop_wait (tevent.c:637)
==3636== by 0x56A37C6: std_event_loop_wait (tevent_standard.c:140)
==3636== by 0x11E03A: ctdb_start_daemon (ctdb_daemon.c:1320)
==3636== by 0x118557: main (ctdbd.c:321)
==3636== Address 0x9c5b660 is 96 bytes inside a block of size 120 free'd
==3636== at 0x4C29D17: free (in
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==3636== by 0x58B32D3: _talloc_free_internal (talloc.c:1063)
==3636== by 0x58B3232: _talloc_free_children_internal (talloc.c:1472)
==3636== by 0x58B3232: _talloc_free_internal (talloc.c:1019)
==3636== by 0x58B3232: _talloc_free_children_internal (talloc.c:1472)
==3636== by 0x58B3232: _talloc_free_internal (talloc.c:1019)
==3636== by 0x58AD692: _talloc_free_children_internal (talloc.c:1472)
==3636== by 0x58AD692: _talloc_free_internal (talloc.c:1019)
==3636== by 0x58AD692: _talloc_free (talloc.c:1594)
==3636== by 0x11EC30: daemon_incoming_packet (ctdb_daemon.c:844)
==3636== by 0x136F4A: lock_fetch_callback (ctdb_ltdb_server.c:268)
==3636== by 0x152489: process_callbacks (ctdb_lock.c:353)
==3636== by 0x152489: ctdb_lock_handler (ctdb_lock.c:468)
==3636== by 0x56A535A: epoll_event_loop (tevent_epoll.c:728)
==3636== by 0x56A535A: epoll_event_loop_once (tevent_epoll.c:926)
==3636== by 0x56A3826: std_event_loop_once (tevent_standard.c:114)
==3636== by 0x569FFFC: _tevent_loop_once (tevent.c:533)
==3636== by 0x56A019A: tevent_common_loop_wait (tevent.c:637)
==3636== by 0x56A37C6: std_event_loop_wait (tevent_standard.c:140)
==3636== by 0x11E03A: ctdb_start_daemon (ctdb_daemon.c:1320)
==3636== by 0x118557: main (ctdbd.c:321)
BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
For all the records listed in VACUUM_FETCH, migration requests are sent
immediately without waiting. This means there can only be a single
VACUUM_FETCH processing active at a time.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
With the processing of one element factored out,
it is more natural to have the actual loop inside the
handler function. This also makes the talloc/free
bracked around the loop more obvious.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This means that DisableIPFailover will be set if it should be.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
If DisableIPFailover is set then something else may be managing public
IP addresses so CTDB should leave them alone.
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
There won't be an IP tree. It is only ever initialised during a
takeover run.
The alternate to this would be to avoid sending
CTDB_SRVID_RECD_UPDATE_IP in "ctdb moveip". This logic is probably
best kept out of the CLI tool.
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The potential for public IP addresses to shuffle around during node
initialisation disappeared a while ago because IP addresses can only
be assigned to a node that is in CTDB_RUNSTATE_RUNNING. This means
that interfaces might as well just be initialised as "up". If any
interfaces are actually "down" then this will be rectified by the
"startup" event in 10.interfaces.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
To support external failover of IP addresses if DisableIPFailover is
set. CTDB's idea of IP address assignment can be manipulated using
"ctdb moveip". Checking if the IP address is already held breaks
this in several places.
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
ctdb_sys_have_ip() should only be run if if do_publicipcheck is set.
This is clearer if written as 2 nested if-statements rather than as a
lazy conjuction.
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Sun May 10 06:10:21 CEST 2015 on sn-devel-104
Don't initialise it after ctdb_event_script_callback_v() may have
short-circuited. This can stop ctdb_event_script_args() from ever
terminating.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
status.done is never set to true unless event_script_callback() is
invoked. The short-circuit in ctdb_event_script_callback_v() means
that this doesn't happen. CTDB can't work very well without 00.ctdb
(for tunable initialisation and the like) but it shouldn't get stuck.
So call the callback when there are no scripts in
event_script_callback().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is done by 10.interace where the monitor event fails when there
is a missing interface. The in-daemon interface checking adds no
value.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
If all nodes are still in, say, FIRST_RECOVERY runstate, then the logs
contain unfortunate noise like:
recoverd:Failed to find node to cover ip 10.0.2.131
This avoids that by adding an early exit that avoids running
takeover_run_core() when there are no nodes in the
CTDB_RUNSTATE_RUNNING.
To support this add the runstate to the ipflags structure. There are
clearly other ways of hacking this but this seems the simplest.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
It isn't possible to hold the recovery lock without having a lock file
set.
This is part of a goal to generalise the recovery lock mechanism to
just use a helper program, which may use a lock file or may use
something else.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Election packets from the current node are ignored at the beginning of
the function, so this does not need to be checked.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
No need to just send it to the recovery master.
This reduces the need for main daemon code to know which node is the
recovery master. The end goal is for the main daemon to not need to
know which node is the recovery master - this information would be
stored in the recovery daemon (and subsequently a separate cluster
management daemon).
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Databases are only pulled by the recovery master, so it can compare
with current node PNN.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Unless this node is the recovery master then this is not needed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
It isn't used anywhere else and is always re-initialised to 0.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
They are initialised and updated but the values are never used.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Simplify update_capabilities() using the capabilities API and store
the capabilities in new field rec->caps rather than scattered around
ctdb->nodes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This reverts commit 39d2fd330a.
An election can occur in the middle of a recovery. During the
election the recovery master can change. When a node loses a round of
the election and stops being the recovery master it releases the
recovery lock. Then at the end of the ongoing recovery all nodes are
able to take the recovery lock so they will all abort.
The most likely cause for a change in recovery master is that several
(all?) nodes are starting up and the "connected-ness" of each node is
a primary factor in winning the election. In this situation the
recovery master can bounce around the cluster.
The simplest solution is to revert this patch so that the recovery
will fail. The new recovery master will then start a new recovery.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon May 4 10:40:36 CEST 2015 on sn-devel-104
Due to usage of CTDB_NO_MEMORY macro,
some of the resources are not freed in failure cases.
Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Guenther Deschner <gd@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
Autobuild-User(master): Günther Deschner <gd@samba.org>
Autobuild-Date(master): Fri Apr 17 16:49:05 CEST 2015 on sn-devel-104