IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
1) It's buggy. Code needs to be carefully written (ie. no busy
loops) to handle running with it, and we fork and run scripts.[1]
2) It makes debugging harder. If ctdbd loops (as has happened recently)
it can be extremely hard to get in and see what's happening. We've already
seen the valgrind hacks.
3) We have seen recent scheduler problems. Perhaps they are unrelated,
but removing this very unusual setup is unlikely to hurt.
4) It doesn't make anything faster. Under all but the most perverse of
circumstances, 99% of the cpu gives the same performance as 100%, and
we will always preempt normal processes anyway.
[1] I made this worse in 0fafdcb8d353 "eventscript: fork() a child for
each script" by removing the switch_from_server_to_client() which
restored it, but even that was only for monitor scripts. Others were
run with RT priority.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 482c302d46e2162d0cf552f8456bc49573ae729d)
The do_setsched was being tested for whether to mmap tdbs: let's make it
explicit. We can also happily move the kill-child eventscript hack under
this flag.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 2ee86cc1f311d7b7504c7b14d142b9c4f6f4b469)
We also no longer return an error before scripts have been run; a special
zero-length data means we have never run the scripts.
"ctdb scriptstatus all" returns all event script results.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 9b90d671581e390e2892d3a68f3ca98d58bef4df)
We're going to allow fetching status of all script runs, so this
name is no longer appropriate.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit f5cb41ecf3fa986b8af243e8546eb3b985cd902a)
The child no longer uses ctdb_ctrl_event_script_init or
ctdb_ctrl_event_script_finished, and the others are redundant: it
doesn't need to tell us it's starting a script when it only runs one.
We move start and stop calls to the parent, and eliminate the RPC
infrastructure altogether.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 391926a87a7af73840f10bb314c0a2f951a0854c)
This unifies code paths and simplifies things: we just hand -ENOEXEC to
ctdb_ctrl_event_script_stop().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit eadf5e44ef97d7703a7d3bce0e7ea0f21cb11f14)
To cope with timeouts when recoveries and transactions collide.
Maybe 100 is too high.
Michael
(This used to be ctdb commit c23d804165e84bdf95ba960c953c736d361011d7)
So that it is correctly handled by recoveries.
Also explicitly set the dmaster field to the current node's pnn.
Michael
(This used to be ctdb commit 03a5bb727b9db1ba952632f08ceb5355f0df842d)
The gap that remained is between checking whether a transaction commit
is in progress and taking the lock. Now we first take the lock and then
check whether a transaction commit is in progress. If so, we release the
lock, wait for one second and retry.
Michael
(This used to be ctdb commit b95524c08bf12914120cb6c818ecc1c99738fe37)
* add __location__
* wrap overly long line
* print unsigned ints as unsigned (reqid, opcode, destnode)
Michael
(This used to be ctdb commit 6b47ea111867c845974aa2687a658ebca2854816)
In ctdb_transaction_commit(), when the trans2_commit control fails, there
is a race condition in the 1 second sleep between the local transaction_cancel
and the call to ctdb_replay_transaction(): The database is not locked, and
neither is the transaction_lock record. So another client can start and possibly
complete a new transaction in this gap, but only on the same node: The locking
of the transaction_lock record on a different node which involves migration of
the record to the other node has been disabled by introduction of the
transaction_active flag on the db which closes precisely this gap from the start
of the commit until the call to TRANS2_FINISH or TRANS2_ERROR.
But this mechanism does not cover the case where a process on the same node
tries to start a transaction: There is no obstacle to locking the transaction_lock
record because the record does not need to be migrated.
This commit closes this race condition in ctdb_transaction_fetch_start()
by using the new ctdb_ctrl_transaction_active() call to ask the local
ctdb daemon whether it has a transaction running on the database.
If so, the check is repeated until the running transaction is done.
This does introduce an additional call to the local ctdbd when starting
transactions, but it does close the (hopefully) last race condition.
Michael
(This used to be ctdb commit 02ee9dfd3c6b09f5c5172a7e38738c20b7f0aecd)
add a global variable holding the pid of the main daemon.
change the tracking of time() in the event loop to only check/warn when called from the main daemon
(This used to be ctdb commit a10fc51f4c30e85ada6d4b7347b0f9a8ebc76637)
database priorities will be used to control in which order databases are locked during recovery in.
(This used to be ctdb commit 67741c0ee01916d94cace8e9462ef02507e06078)
There are two races in concurrent transactions on a single node.
One in starting a transaction, and one with committing (replaying).
This commit closes the first race by storing the pid in the
transaction-lock record and comparing the own pid against it
as a measure to prevent starting a second transaction when
a second node has come inbetween and changed the pid in the lock
record.
Michael
(This used to be ctdb commit 84e5a55a900b01903b80e23045edfc726d8d77a1)