IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
We know ask for the known and available interfaces.
This means a node gets a RELEASE_IP event for all interfaces
it "knows", but doesn't serve and a node only gets a TAKE_IP event
for "available" interfaces.
metze
(This used to be ctdb commit a695a38e49e7c3e15a9706392dc920eeab1f11ba)
With this script it's possible to generate routing tables
per public ip address.
metze
(This used to be ctdb commit ff5678fbec2daef461143acf00cef3f94d7655fc)
When two releaseip events run in parallel it's possible that the 2nd script
readds a secondary ip that was removed by the 1st script.
metze
(This used to be ctdb commit e02417b2a55c45ac2c125b1b3463c9c39e7bc07a)
This is needed because the "startup" event runs after the initial recovery,
but we need to do some actions before the initial recovery.
metze
(This used to be ctdb commit e953808449c102258abb6cba6f4abf486dda3b82)
This has curently no affect on the generated configure and config.h.in files.
metze
(This used to be ctdb commit 9d39ada437b02d84b70a5fea78b61bbb32f16d81)
This finished commit a78b8ea7168e5fdb2d62379ad3112008b2748576.
The logic was missing in events_standard (the one that's used by default).
metze
(This used to be ctdb commit 49f0488a5e60c74b6b8361dffcd09ebb2db740f0)
configureable using --log-ringbuf-size=<num-entries>.
Add an entry in the sysconfig file to set this persistently.
(This used to be ctdb commit c79c2da69bc352f509e7fca4b9172a4b7f23c0f8)
This reverts commit 7c95e56ba871a4e0cb893a5cb5d821e7ff6e6dd6.
wbinfo --ping-dc is proving too unreliable.
(This used to be ctdb commit b70021856e76df1ba407c83cfc19bf332fbfc869)
This reverts commit 7b73834ba3ac197cc8a3020c111f9bb2c567e70b.
wbinfo --ping-dc is proving too unreliable.
(This used to be ctdb commit 178f429a7b6d1008d35e857b6ca1df6adb60d255)
Another corner case when we terminate running monitor scripts to run
something else: logging can flush the output and we write to a NULL
pointer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit eb22c34bccc8a04fcf63efa2bc48d9788709382e)
(Reapplied with merge after accidental revert)
Previously we updated cb_status a each script finished. Since we're storing
the status anyway, we can calculate it by iterating the scripts array
itself, providing clear and uniform behavior on all code paths.
In particular, this fixes a longstanding bug when we abort monitor
scripts to run some other script: the cb_status was uninitialized. In
this case, we need to hand *something* to the callback; 0 might make
us go healthy when we shouldn't. So we use the last status (normally,
this will be the just-saved current status).
In addition, we make the case of failing the first fork for the script
and failing other script forks the same: the error is returned via the
callback and saved for viewing through 'ctdb scriptstatus'.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 2c84fe393ff2b961abf77d58a371c24db5ecb93b)
We shouldn't set ctdb->current_monitor until we set destructor: that's
what cleans it up.
Also, free state->scripts on no-scripts exit path: it's not a child of
state because we need it in the destructor.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 843a2ed5ef85f628788b0caf7417c6b61b5c6d3f)
Previously we updated cb_status a each script finished. Since we're storing
the status anyway, we can calculate it by iterating the scripts array
itself, providing clear and uniform behavior on all code paths.
In particular, this fixes a longstanding bug when we abort monitor
scripts to run some other script: the cb_status was uninitialized. In
this case, we need to hand *something* to the callback; 0 might make
us go healthy when we shouldn't. So we use the last status (normally,
this will be the just-saved current status).
In addition, we make the case of failing the first fork for the script
and failing other script forks the same: the error is returned via the
callback and saved for viewing through 'ctdb scriptstatus'.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 5d50f0e16948d18009f6623f132113f7273efc7f)
We don't want ctdb stalling due to paging; this can be far worse than
scheduling delays. But if we simply do mlockall(MCL_FUTURE), it
increases the risk that mmap (ie. tdb open) or malloc will fail,
causing us to abort.
This patch is a compromise: we mlock all current pages (including
10k of future stack for expansion) and then relock when a client
asks us to open a TDB. We warn, but don't exit, if it fails.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 82f778e85440bc713d3f87c08ddc955d3cfce926)
1) It's buggy. Code needs to be carefully written (ie. no busy
loops) to handle running with it, and we fork and run scripts.[1]
2) It makes debugging harder. If ctdbd loops (as has happened recently)
it can be extremely hard to get in and see what's happening. We've already
seen the valgrind hacks.
3) We have seen recent scheduler problems. Perhaps they are unrelated,
but removing this very unusual setup is unlikely to hurt.
4) It doesn't make anything faster. Under all but the most perverse of
circumstances, 99% of the cpu gives the same performance as 100%, and
we will always preempt normal processes anyway.
[1] I made this worse in 0fafdcb8d353 "eventscript: fork() a child for
each script" by removing the switch_from_server_to_client() which
restored it, but even that was only for monitor scripts. Others were
run with RT priority.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 482c302d46e2162d0cf552f8456bc49573ae729d)
The do_setsched was being tested for whether to mmap tdbs: let's make it
explicit. We can also happily move the kill-child eventscript hack under
this flag.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 2ee86cc1f311d7b7504c7b14d142b9c4f6f4b469)
Depending on --max-persistent-check-errors we allow ctdb
to start with unhealthy persistent databases.
The default is 0 which means to reject a startup with
unhealthy dbs.
The health of the persistent databases is checked after each
recovery. Node monitoring and the "startup" is deferred
until all persistent databases are healthy.
Databases can become healthy automaticly by a completely
HEALTHY node joining the cluster. Or by an administrator
with "ctdb backupdb/restoredb" or "ctdb wipedb".
metze
(This used to be ctdb commit 15f133d5150ed1badb4fef7d644f10cd08a25cb5)
All other scripts do 'loadconfig ctdb' before any other 'loadconfig foo'
call. I think we should do the same in statd-callout.
Otherwise it's very confusing, if you have configured some Options
in /etc/sysconfig/ctdb, but /etc/ctdb/statd-callout doesn't notice
them.
metze
(This used to be ctdb commit 10d95581fb90bfdf58ec32345c4e36c27acf4f37)
(cherry picked from commit 4334092cba)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 093f57a2c00f2d629a3b58e58202f1a7e1bbd406)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(cherry picked from samba commit 9776cb0345)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit d1873bd81bfc9f486b88f3a38c65c7de8f5a0909)
metze
(cherry picked from samba commit 5ca0a4bfd6)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 04aeac728f56c65b973762f09977de1b1b99099e)
We need to keep TDB_ALLOW_NESTING as default behavior,
so that existing code continues to work.
However we may change the default together with a major version
number change in future.
metze
(cherry picked from samba commit 3b9f19ed91)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit c1c0ede32dc00ed619d1cf5fda40a9de43995f3a)
metze
(cherry picked from samba commit 85449b7bcc)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 855391c1e37012b0d6c673a304bb8da8a1efcd72)
While studying tdb, I've noticed a couple of mismatches between readme
and actual code:
- tdb_open_ex changed it's log_fn argument to log_ctx
- there is now no tdb_update(), which it seems was transformed into
non-exported tdb_update_hash()
There were other mismatches, but I don't remember them now, sorry.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(cherry picked from samba commit 83de5c8263)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 7a88f1df9190674deaf5dcbedad02ae4120a5263)
The reason I do it is that when using older python-tdb as shipped in
Debian Lenny, python interpreter crashes on this test:
(gdb) bt
#0 0xb7f8c424 in __kernel_vsyscall ()
#1 0xb7df5640 in raise () from /lib/i686/cmov/libc.so.6
#2 0xb7df7018 in abort () from /lib/i686/cmov/libc.so.6
#3 0xb7e3234d in __libc_message () from /lib/i686/cmov/libc.so.6
#4 0xb7e38624 in malloc_printerr () from /lib/i686/cmov/libc.so.6
#5 0xb7e3a826 in free () from /lib/i686/cmov/libc.so.6
#6 0xb7b39c84 in tdb_close () from /usr/lib/libtdb.so.1
#7 0xb7b43e14 in ?? () from /var/lib/python-support/python2.5/_tdb.so
#8 0x0a038d08 in ?? ()
#9 0x00000000 in ?? ()
master's pytdb does not (we have a check for self->closed in obj_close()),
but still...
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(cherry picked from samba commit 71a21393dd)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 03372b4ea8ba2938468a5c0fc234d604966ce070)
So that erroneous double tdb_close() calls do not try to close() same
fd again. This is like SAFE_FREE() but for fd.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(cherry picked from samba commit b4424f8234)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit f5c992bdaeb73ef726ff4728a9922721474cd6f5)
It's Tdb.get(), not Tdb.fetch().
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(cherry picked from samba commit cfed5f946d)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 76aacdd8e1106f26565e25903091a757b59cd7e2)
This can help with ldb where we rewrite the index records
(cherry picked from samba commit d4c0e8fdf0)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 470750fa2e3cf987f10de48451b1ee13aab03907)
metze
(cherry picked from samba commit 3b62e250c0)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 03b3682e3fa53c9f5fdf2c4beac8b5d030fd2630)
Also, set logging function so we get more informative messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(cherry picked from samba commit 0944931159)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 6ac7ef8bf4d384f880c7f483ace70f8e08c15a8b)
ctdb wants a quick way to detect corrupt tdbs; particularly, tdbs with
loops in their hash chains. tdb_check() provides this.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(cherry picked from samba commit 022b4d4aa6)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit df1a3ce0380fa9d8722b2f9b16f65557095e4c83)
We no longer use swig for pytdb, so there is no need for swig make
rules. Also pytdb.c header should be updated.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(cherry picked from samba commit ecbe5ebd8d)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 27611d6a0c313732e438cb24c82b9de126e50156)
This fixes the build on Tru64.
metze
(cherry picked from samba commit 3718cf294a)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 5652e403be099f35cdd29fda8ba4fe2c35de8035)
With the ctdb checkin dde9f3f006 tdb optimized out write lock checks for
write-enabled transaction. Sadly, this also removed the possibility to ever
remove dead records left over from tdb_delete calls within a transaction.
Tridge, please check this! Did dde9f3f006 have any reason beyond performance
optimizations?
Thanks,
Volker
(cherry picked from samba commit 3f884c4ae3)
(This used to be commit 1d85e0647e)
(cherry picked from samba commit 8c88209c6f)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit b02bf7659f04f1fa203834bd75a2392b48e56c16)
Found by cppcheck:
[lib/tdb/tools/tdbtorture.c:326]: (error) Memory leak: pids
(cherry picked from samba commit 497b9e460b)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 5d4cc4b018a538dc3f1d79fe091f3e6e67003daf)
tdbbackup was originally written before we had transactions, and it
attempted to use its own fsync() calls to make it safe. Now that we
have transactions we can do it in a much safer (and faster!) fashion
(cherry picked from samba commit 2e4247782b)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit cd23d36ada9631095ca68663516de0c8d8c3bbed)
This means you can kill it at any time and expect no corruption.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(cherry picked from samba commit 0fc6800005)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit f7278a277ed91587cae5b5e3660dad7124bdb73f)
It was a regrettable hack which I used to reduce line count in tdb; in fact it caused confusion as can be seen in this patch.
In particular, ecode now needs to be set before TDB_LOG anyway, and having it exposed in
the header is useless (the struct tdb_context isn't defined, so it's doubly useless).
Also, we should never set errno, as io.c was doing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(cherry picked from samba commit b77f41d58b)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit a6620f6e74aadc708395b21b42303d1082192fcc)
When TDB_TRACE is defined (in tdb_private.h), verbose tracing of tdb operations is enabled.
This can be replayed using "replay_trace" from http://ccan.ozlabs.org/info/tdb.
The majority of this patch comes from moving internal functions to _<funcname> to
avoid double-tracing. There should be no additional overhead for the normal (!TDB_TRACE)
case.
Note that the verbose traces compress really well with rzip.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(cherry picked from samba commit 703004340c)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit b01b756cb577f32a1ec4597efb00017241e01685)
There was a race condition that caused the torture.tdb to be left in a
state that needed recovery. The torture code thought that any message
from the tdb code was an error, so the "recovered" message, which is a
TDB_DEBUG_TRACE message, marked the run as being an error when it
isn't.
(cherry picked from samba commit 5dcf0069b6)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 43c97b259b19c42b4edc7f83dbfc5e486568b4e3)
Michael
(cherry picked from samba commit e440a2e11e)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit c1b8d32b4ef87b9d8f37b451f47fcee2ea753d21)
This adds 3 simple speed tests to tdbtool, for transaction store,
store and fetch.
On my laptop this shows transactions costing about 10ms
(cherry picked from samba commit e15027155d)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 463279c972fa4538919bdd1dff48ca6b2fb8d49c)
So one can perform tdbtool operations protected by transactions.
Michael
(cherry picked from samba commit 91e1bab2e9)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 35a5b874b925380f7c227e47aebb590c9db4739e)
Michael
(cherry picked from samba commit 817383d88d)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit dc287a7d7420cca0b104049e689a73202bc535f8)
We previously only allowed a commit to happen after a prepare
commit. It is in fact safe to allow reads between a prepare and a
commit, and the s4 replication code can make use of that, so allow it.
(cherry picked from samba commit 46c99ec2a3)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 5ef5ddb8369e5e76173285fe9a08498dc8dc73ab)
Michael
(cherry picked from samba commit 55dcf928eb)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit ef1dc585d869a9e48164cd65bafc92c1da245007)
Michael
(cherry picked from samba commit cfa4e7ec75)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 0ae735b7a2096a40e5e47086ec41d9d45ef6d36b)
Michael
(cherry picked from samba commit 25939a627f)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 2e69647404c87c438ae7c180277ac3b532941efd)
by first concatenating multilint parentheses and removing typefes afterwards.
Michael
(cherry picked from samba commit 13bfcd5a93)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 37225f1ed3f70d7259c2af2c51c671105c34476a)
Michael
(cherry picked from samba commit ecd12bfb38)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 66fffa577e051212ac7541be906b6c80f4a7c0c9)
Michael
(cherry picked from samba commit 400f08450b)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 310d673b7cb9000d76437d78e43bc2bf133e4e14)
Michael
(cherry picked from samba commit 907e05595f)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit f70e371d70e334a7254649b2bb09aa382e6f09bb)
Guenther
(cherry picked from samba commit 1c2f4919ab)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 9d5015e6fc68d3eb9e7b7178dbaf8c129dc79471)
USAGE: abi_checks.sh LIBRARY_NAME header1 [header2 ...]
This creates symbol signature lists using the mksyms and mksigs scripts
and compares them with the checked in lists.
Michael
(cherry picked from samba commit 9636e0d373)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 724d71dc838750fff91a45359feeb6e71bf0a4c7)
This produces output like the output gcc produces when
invoked with the -aux-info switch.
Run like this: cat include/tdb.h | ./script/mksigs.pl
This simple parser is probably too coarse to handle all
possible header files, but it treats tdb.h correctly...
Michael
(cherry picked from samba commit 0760a04ef9)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 141422d9dc24b15b7b8bc7831adab90367a729f7)
Michael
(cherry picked from samba commit 006fd0c43c)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit aed864dceaf6ec1e6e6066a587c708b485901200)
In future, this may happen, and we don't want to clobber them.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(cherry picked from samba commit 398d0c2929)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit eebd467961dad6cfb38c2a5d6e4b4dbf86e55e63)
On systems with 32 bit offsets, expansion and fcntl locking on these records
will fail anyway. SAMBA already does '#define _FILE_OFFSET_BITS 64' in
config.h (on my 32-bit x86 Linux system at least) to get 64 bit file offsets.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(cherry picked from samba commit 252f7da702)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 2d768f664e6db65b3b7e0c732f33ee2b806892f9)
This reverts commit 401f421fa003d9515df15e759b50b56e0c67d69c.
Conflicts:
include/ctdb_private.h
server/ctdb_tunables.c
(This used to be ctdb commit b883d19a495a41a22db37f9c2cf6250fee529de0)
Date: Tue Dec 15 15:53:30 2009 +1030
eventscript: hack to avoid overloading valgrind
Now we fork one child per script, when running under valgrind the
load
gets quite high. This is because valgrind does a lot of work after
exit,
and we don't wait for the children to finish; we start the next one
when
the child reports status via the pipe.
This fix is ugly, but simple.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 6ed34d5320c39d8a55f2a36ad4c1ab574e0b0796)
I saw once where the master ctdbd logging structure was talloc freed
which caused issues.
So only free the structure if it is NOT the master structure.
This needs to be looked into in more detail.
(This used to be ctdb commit bcf494b81f4277dc75f05faccf0c446bd15f6e2b)
or else we can crash if we receive log messages from a child but the log structure has been freed()
(This used to be ctdb commit ea9e39369379939abf6a4076fa2014c10c1a9ad0)
Subject: eventscript: fix spinning at 100% cpu when child exits.
ctdbd was spinning reading 0 from a pipe, as soon as the first
eventscript finishes.
This was caused by the intersection between a78b8ea7168e "Run only one
event for each epoll_wait/select call" and 32cfdc3aec34 "eventscript:
ctdb_fork_with_logging()". Unavoidable mid-air collision, since both
worked fine and both were developed simultaneously.
When the script exits, we have two pipes open to it: one for any
stdout/stderr for logging (ctdb_log_handler), and one for the result
(ctdb_event_script_handler). The latter frees everything, including
the log fd and event structure.
We used to get one callback to ctdb_log_handler, which got a harmless
0-length read, then one to ctdb_event_script_handler which cleaned up.
Now we only do one callback per poll, we need the logging function to
clean itself up so we can make process.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 211ea7907e8e96041aa6f7d086551d64d065a8a3)
since we no longer ban nodes when dodgy scripts continue to hang.
We now only mark nodes as unhealthy if monitor events fail or timeout. Never ban.
(This used to be ctdb commit 5c8e56fc7a518e115bceac257867739283cf6a1e)
In other news, did you know ctime() returns a \n-terminated string?
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 1b4e7bb548976b99f122142b040494b6f9911962)
Commit c1ba1392fe "eventscript: get rid of ctdb_control_event_script_finished
altogether" was wrong: there is one case where we want to free the script
without transferring their status to last_status. This happens because we
always kill an running monitor command when we run any other command.
This still isn't quite right (and never was): the callback will be called
with status value 0, which might flip us to HEALTHY if we were unhealthy.
This is conveniently fixed in my next set of patches :)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 0ea0e27d93398df997d3df9d8bf112358af3a4a5)
there is no rational need for a setting where we permanently mark nodes as disabled everytime an eventscript fails
(This used to be ctdb commit 68a8ee99b128a5ec883600735626bdb3bbc9c503)
This reverts commit 8aef46d2aab3efb322dda51eaa202653cefd5222.
This special recovery logic is wrong now with the transaction rewrite.
The treatment of persistent databases will later be rewritten to use the
database sequence number.
Michael
(This used to be ctdb commit c5a0aef668a63f927d6184612b13ce316eb4a0be)
This patch improves the handling of the fetch_lock operation on non-persistent
databases that ctdb clients have to do very frequently.
The normal flow how this goes is the following:
1. Client does a local fetch_lock on the database
2. Client looks if the local node is dmaster.
If yes, everything is fine
If no, continue here
3. Client unlocks the local record
4. Client issues a "get me the record" call to ctdbd
5. ctdbd goes out and fetches the dmaster role
6. ctdbd tells the client to retry
7. Client starts over again
The problem is between step 6 and 7: Before the client has had the chance to
retry (i.e. catch the record with a fetch_locked), another node might have come
asking ctdbd to migrate away the record again. This is a real problem, I've
seen >20 loops of this kind in real workloads.
This patch does the following: Whenever ctdb receives a record as result of
step 5, it puts the key on a "holdback list". As long as a key is on this list,
a request to migrate away the dmaster is put on hold. It is the client's duty
to issue the "CTDB_CONTROL_GOTIT" control when it has successfully done step 2
after having asked ctdb to fetch the record. This will release the key from the
"holdback list" and re-issue all dmaster migration requests.
As a safeguard against malicious clients, once a second (default 1000msecs,
tunable "HoldbackCleanupInterval" in milliseconds) ctdbd goes over the list of
held back keys, deletes them and releases all held back migration requests.
(This used to be ctdb commit 5736e17c139c9a8049e235429aeae0c6c9d0e93d)
Make it return success for make test.
This is temporarily disabled until the rewrite of the
transaction code (in samba and the daemon) using the global
lock feature has been ported to the ctdb client code.
Michael
(This used to be ctdb commit 78ca29352aa39f4ef4e41096b92d55cb2e0d348a)
This is a simplified version of the trans2 commit control:
It just rolls out the marshall buffer to all active nodes.
It is the main ctdbd part of the re-implementation of the
persistent transactions. The client code is changed to
take a global lock to start a transactions and store into
the marshal buffer instead of writing to the local tdb
under a local transaction.
The old transaction implementation is going to be
removed in a later commit.
Michael
(This used to be ctdb commit f66428f9d2013080a414404c1ba6117888352fd6)
Date: Wed, 9 Dec 2009 22:45:12 +0100
Subject: [PATCH] Revert an accidential commit
(This used to be ctdb commit af6656f2844d8fd72204a70358c9d589dbe1bd34)
Writes without transaction are not possible any more on
persistent databases.
Michael
(This used to be ctdb commit 59f46d7261dfdbdef900bf95dd9eb28ad22a46b2)
This is useless now that persistent write operations without
transaction are forbidden.
Michael
(This used to be ctdb commit b022863d44026c19d5aae54aa485b670bea0540e)
This is useless now that persistent writes without transactions are forbidden.
Michael
(This used to be ctdb commit 9ac82311d796e1fab31f8de62b8ccc754445093c)
This is like the 53_ctdb_transaction test, but it additionally
runs a loop with recoveries while the transactions are running.
When called like this, the transaction loops run for 10 minutes:
CTDB_TEST_TIMELIMIT=600 tests/scripts/run_tests tests/simple/54_ctdb_transaction_recovery.sh
The default timelimit is 30 seconds.
Michael
(This used to be ctdb commit 2ff2679e8f3d50ebf735f2c420898a84268bdc95)
This gets just too noisy on a busy system.
And it is purley informational anyways...
Michael
(This used to be ctdb commit 7f64a00c76203fdf6673c3f862a4bfd17fb848d7)
This might be a bit less efficient, but experience in winbind has shown that
event callbacks can trigger changes in the socket state in very hard to
diagnose ways.
(This used to be ctdb commit a78b8ea7168e5fdb2d62379ad3112008b2748576)
syslog.h says:
LOG_NOTICE 5 normal but significant condition
LOG_INFO 6 informational
several vacuuming related logs logged at NOTICE level although I don't see
any real significance, these are just informational messages for me
Signed-off-by: Christian Ambach <christian.ambach@de.ibm.com>
(This used to be ctdb commit 142111983c103e90ccccbe26fd580c4eb28e949f)
add the __location__ macro to the logs to get a better idea
in which loop the problem occured
Signed-off-by: Christian Ambach <christian.ambach@de.ibm.com>
(This used to be ctdb commit dccb549fd6a6e338063699544e52f2a1a6a966b5)
when checking link status for an interface, first
check if this interface is in fact a bond device
(by the precense of a /proc/net/bonding/IFACE file)
and use that file for checking status.
Othervise assume ib* is an infiniband interface which we donnt know how
to check, or otherwise it is an ethernet interface and ethtool should
hopefully work.
(This used to be ctdb commit 8cc6c5de3d7abb0b72eaa6e769e70963b02d84cb)
We also no longer return an error before scripts have been run; a special
zero-length data means we have never run the scripts.
"ctdb scriptstatus all" returns all event script results.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 9b90d671581e390e2892d3a68f3ca98d58bef4df)
We're going to need this so ctdb can query non-monitor status.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 53bc5ca23ca55a3ac63a440051f16716944a2a51)
Ronnie suggested this; seems like a very good idea.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 93153bca68926401dc9ae7fd77ed3f17be923344)
We always have to call it before freeing the state; we should just do
this work in the destructor itself.
Unfortunately, the script state would already be freed by the time
the state destructor is called, so we make the script state a child of
ctdb, and talloc_free() it manually on the one path which doesn't use
the destructor.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit c1ba1392fe52762960e896ace0aca0ee4faa94d5)
Rather than only tranferring to last_status for monitor events, do
it for every event (ctdb->last_status is now an array).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit c73ea56275d4be76f7ed983d7565b20237dbdce3)
We only need ctdb->current_monitor so we can kill it when we want to run
something else; we don't need to use it here as we always know what script
we are running.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 4cf1b7c32bcf7e4b65aec1fa7ee1a4b162cac889)
The only difference between the exposed an internal structure now is
that the name and output fields were pointers. Switch to using
ctdb_scripts_wire/ctdb_script_wire internally as well so marshalling
is a noop.
We now reject scripts which are too long and truncate logging to the
511 characters we have space for (the entire output will be in the
normal ctdbd log).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit fd2f04554e604bc421806be96b987e601473a9b8)
We're going to allow fetching status of all script runs, so this
name is no longer appropriate.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit f5cb41ecf3fa986b8af243e8546eb3b985cd902a)