mirror of https://github.com/samba-team/samba.git synced 2025-01-12 09:18:10 +03:00
samba-mirror/ctdb
Michael Adam 4cd06a330e Fix persistent transaction commit race condition.
In ctdb_client.c:ctdb_transaction_commit(), after a failed
TRANS2_COMMIT control call (for instance because the 1-second
timeout was exceeded while waiting for a busy node's reply), there
is a 1-second gap between the transaction_cancel() and
replay_transaction() calls in which there is no lock on the
persistent db. Because ctdbd has no global state indicating that a
transaction is in progress, other nodes may succeed in starting
transactions on the db in this gap and, even worse, work on top of
the possibly already pushed changes. As a result, the data diverges
across the nodes.

This change fixes this by introducing global state for a transaction
commit being active in the ctdb_db_context struct and a db_id field
in the client, so that a client keeps track of _which_ tdb it has a
transaction commit running on. These data are set by ctdb upon
entering the trans2_commit control and are cleared in the
trans2_error or trans2_finished controls. This makes it impossible
to start another transaction or migrate a record to a different
node while a transaction is active on a persistent tdb, including
during the retry loop.
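
The bookkeeping described above can be sketched roughly as follows.
This is a simplified illustration, not the code from this commit: the
struct, field and function names (sketch_db, sketch_client,
sketch_commit_begin and so on) are hypothetical stand-ins for
ctdb_db_context, the client's db_id field and the trans2_* control
handlers. Using 0 to mean "no commit in progress" and letting the
owning client re-enter the commit during its retry are assumptions of
the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-database state kept by the daemon. */
struct sketch_db {
    uint32_t db_id;
    bool commit_active;     /* true while a transaction commit is in flight */
};

/* Hypothetical stand-in for the per-client state kept by the daemon. */
struct sketch_client {
    uint32_t commit_db_id;  /* db the client is committing on; 0 = none (assumed) */
};

enum { SKETCH_OK = 0, SKETCH_BUSY = -1 };

/* Starting a transaction is refused while a commit is active on the db. */
int sketch_trans_start(const struct sketch_db *db)
{
    assert(db != NULL);
    return db->commit_active ? SKETCH_BUSY : SKETCH_OK;
}

/* Entering the commit: mark the db busy and remember which db the client
 * is committing on. Only the owning client may re-enter (its retry). */
int sketch_commit_begin(struct sketch_db *db, struct sketch_client *client)
{
    assert(db != NULL && client != NULL);
    if (db->commit_active && client->commit_db_id != db->db_id) {
        return SKETCH_BUSY;
    }
    db->commit_active = true;
    client->commit_db_id = db->db_id;
    return SKETCH_OK;
}

/* Error or finished: clear the state, but only for the owning client. */
void sketch_commit_end(struct sketch_db *db, struct sketch_client *client)
{
    assert(db != NULL && client != NULL);
    if (db->commit_active && client->commit_db_id == db->db_id) {
        db->commit_active = false;
        client->commit_db_id = 0;
    }
}
```

In this sketch the flag stays set between a failed commit and its
replay, so a second client calling sketch_trans_start() in that window
gets SKETCH_BUSY instead of silently proceeding.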

This approach is deadlock-free and still allows the recovery process
to be started in the retry gap between cancel and replay.
Also note that this solution does not require any change on the
client side.

This was debugged and developed together with
Stefan Metzmacher <metze@samba.org> - thanks!

Michael

(This used to be ctdb commit f88103516e5ad723062fb95fcb07a128f1069d69)
2009-07-29 11:12:39 +10:00
client client: set dmaster in ctdb_transaction_store() also when updating an existing record 2009-07-29 10:28:35 +10:00
common When we dispatch a message to a handler, pass the data as a real talloc object so that the handler can talloc_steal() the message content. 2009-07-02 12:58:49 +10:00
config update the natgw eventscript to set the NATGW capability when this feature is used 2009-07-28 10:00:33 +10:00
doc document the two new commands setlmasterrole and setrecmasterrole 2009-07-28 13:54:08 +10:00
ib Whitespace changes and using the CTDB_NO_MEMORY() macro changes to 2009-05-21 11:49:16 +10:00
include Fix persistent transaction commit race condition. 2009-07-29 11:12:39 +10:00
lib New attempt at TDB transaction nesting allow/disallow. 2009-05-25 17:04:42 +10:00
packaging new version 1.0.87 2009-07-17 13:01:11 +10:00
server Fix persistent transaction commit race condition. 2009-07-29 11:12:39 +10:00
tcp make it possible to start the daemon in STOPPED mode 2009-07-09 11:57:20 +10:00
tests new version 1.0.87 2009-07-17 13:01:11 +10:00
tools When processing the stop node control reply in the client code we should 2009-07-29 09:58:40 +10:00
utils From William Jojo <w.jojo[AT]hvcc.edu> 2009-06-04 09:41:05 +10:00
web fix the git path to the repository 2009-05-25 12:15:13 +10:00
.bzrignore more code rearrangement 2007-06-07 22:16:48 +10:00
.gitignore From Mathieu Parent <math.parent@gmail.com> 2009-04-08 09:21:11 +10:00
aclocal.m4 initial version 2006-11-18 10:41:20 +11:00
autogen.sh From Mathieu Parent <math.parent@gmail.com> 2009-04-08 09:21:11 +10:00
config.guess more merges for GPLv3 update 2007-07-10 15:46:05 +10:00
config.mk minor back-merge from samba4 2007-07-10 18:13:47 +10:00
config.sub more merges for GPLv3 update 2007-07-10 15:46:05 +10:00
configure.ac remove the obsolete ipmux component. 2009-05-25 12:33:52 +10:00
configure.rpm fixed permissions on configure.rpm 2008-04-22 16:48:25 +02:00
COPYING add a licence file 2009-02-07 08:10:34 +11:00
ctdb.pc.in (This used to be ctdb commit b0718551f55d5da9be0e6aba233f57c1ff8509be) 2009-04-08 09:14:20 +10:00
install-sh initial version 2006-11-18 10:41:20 +11:00
Makefile.in rename 99.routing to 11.routing so that it executed before the service scripts 2009-06-23 11:29:26 +10:00