IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
This is to cope with timeouts when recoveries and transactions collide.
Maybe 100 is too hight, but 10 or even 20 have been too low in a
very busy environment.
Michael
In ctdb_transaction_commit(), when the trans2_commit control fails, there
is a race condition in the 1 second sleep between the local transaction_cancel
and the call to ctdb_replay_transaction(): The database is not locked, and
neither is the transaction_lock record. So another client can start and possibly
complete a new transaction in this gap, but only on the same node: The locking
of the transaction_lock record on a different node which involves migration of
the record to the other node has been disabled by introduction of the
transaction_active flag on the db which closes precisely this gap from the start
of the commit until the call to TRANS2_FINISH or TRANS2_ERROR.
But this mechanism does not cover the case where a process on the same node
tries to start a transaction: There is no obstacle to locking the transaction_lock
record because the record does not need to be migrated.
This commit closes this race condition in ctdb_transaction_fetch_start()
by using the new ctdb_ctrl_transaction_active() call to ask the local
ctdb daemon whether it has a transaction running on the database.
If so, the check is repeated until the running transaction is done.
This does introduce an additional call to the local ctdbd when starting
transactions, but it does close the (hopefully) last race condition.
Michael
There are two races in concurrent transactions on a single node.
One in starting a transaction and one with replay during commit.
This commit closes the first race by storing the client pid in the
transaction-lock record and comparing the stored pid against its own
pid after releasing the lock and refetching the record inside the
transaction.
Michael
1) when all nodes write the same value to the record, or when writing
a value that is already there, we can skip the write and save
ourselves a network transactions
2) when all remote nodes fail an update, and we then fail a replay, we
don't need to trigger a recovery. This solves a corner case where
we could get into a recovery loop
(This used to be commit 2481bfce4307274806584b0d8e295cc7f638e184)
This is because ctdbd can fail in performing the persistent_store
due to race conditions, and this does not mean it can't succeed
the next time.
To not loop infinitely, this makes use of a new parametric option:
"dbwrap ctdb:max store retries" (integer) which defaults to 5
and sets the upper limit for the number or repeats of the
fetch/store cycle.
Michael
(This used to be commit 2bcc9e6ecef876030e552a607d92597f60203db2)
in the persistent db_ctdb_store operation.
This is to prevent deadlocks in db_ctdb_persistent_store().
There is a tradeoff: Usually, the record is still locked
after db->store operation. This lock is usually released
via the talloc destructor with the TALLOC_FREE to
the record. So we have two choices:
- Either re-lock the record after the call to persistent_store
or cancel_persistent update and this way not changing any
assumptions callers may have about the state, but possibly
introducing new race conditions.
- Or don't lock the record again but just remove the
talloc_destructor. This is less racy but assumes that
the lock is always released via TALLOC_FREE of the record.
I choose the first variant for now since it seems less racy.
We can't guarantee that we succeed in getting the lock
anyways. The only real danger here is that a caller
performs multiple store operations after a fetch_locked()
which is currently not the case.
Michael
(This used to be commit d004c9a7281d2577c3ba2012c8f790cc198ea700)
Only filled in for tdb so far, for rbt it's pointless, and ctdb itself needs to
be extended
(This used to be commit 0a55e018dd68af06d84332d54148bbfb0b510b22)
The lockup could happen when packet_read_sync() gets two packets in a row, the
first one being an async message, and the second one being the response to a
ctdb request.
Also add some debug msg to ctdb_conn.c, and cut off the "locking key" messages
to only dump 20 hex chars at debug level 10. >10 will dump everything.
(This used to be commit 0a55880a240b619810371a19144dd0a75208adfe)