IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
The pending calls are migration requests received from clients (over unix
domain socket) which are under processing. After a recovery is finished,
any requests which are under processing will be dropped since they do
not belong to the current generation. All the pending call requests
are resent with new generation to restart record migrations.
This is in preparation for parallel database recovery.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Deferring packets has a nasty interaction with recovery. All deferred
packets must be dropped when recovery happens, since those packets are
tracked as pending requests and will be re-sent with new generation.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri Sep 5 09:30:50 CEST 2014 on sn-devel-104
When CTDB receives DMASTER_REQUEST or DMASTER_REPLY packet, the specified
record needs to be updated as soon as possible to avoid inconsistent
dmaster information between nodes. During this time, queue up all calls
for that record and process them only after dmaster request/reply has
been processed.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Seeing these with -Wall:
../server/ctdb_call.c:1117:3: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
record_flags = *(uint32_t *)&c->data[c->keylen + c->datalen];
^
memcpy() seems to be the easiest way to get fix these. The
alternative would be to use unmarshalling functions.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Revoking readonly record involves first marking the record on dmaster as
RO_REVOKING_READONLY. Then all the other nodes are sent update_record
control to get rid of RO_DELEGATION. Once that succeeds, the record
is marked RO_REVOKING_COMPLETE.
Currently, revoking of readonly delegations on the nodes is tried only
once. If a node goes in recovery, it can fail update_record control and
revoke code will abort ctdb. Since database recovery would revoke all
readonly delegations anyway, there is no reason to abort. Simply undo
the start of revoke process by resetting RO_REVOKING_READONLY flag.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Wed Aug 13 11:24:09 CEST 2014 on sn-devel-104
This patch makes the subsequent logic change small and easier to
understand.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
In case of control timeouts, readonly revoke code currently aborts. This
needs to be fixed. Meanwhile, using control_timeout instead of 5 seconds,
increases the timeout to 60 seconds.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Mar 31 07:20:48 CEST 2014 on sn-devel-104
This removes data types and structure elements related to TRANS2
persistent transaction code.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 22a253b7ccf1ff854cddf0b67969dc84d7d6a654)
This was the comment block I was touching and meant to adapt in
commit 00d3bf092e2f72eda330978c75ec85f17e870553.
My search was apparently not unique...
Signed-off-by: Michael Adam <obnox@samba.org>
(This used to be ctdb commit 09940255011b119dc6af3304f5d3e9568e6006fd)
This should avoid memory bloat when a request bounces between nodes.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 7677fb263f06a97398e2c546e32273fb96edca69)
This reverts commit 035c0d981bde8c0eee8b3f24ba8e2dc817e5b504.
This is a premature optimization. Record can bounce between nodes
very quickly if it is a contended record. There is no need to hold a
record on a node unnecessarily. In case record contention becomes bad,
enabling sticky records on a database is a better idea.
Conflicts:
include/ctdb_private.h
server/ctdb_tunables.c
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit ac417b0003f0116f116834ad2ac51482d25cfa0d)
Instead of logging from ctdb_request_call(), log the message from
ctdb_make_record_sticky(). That way if the record is already sticky, the
message is not repeated unnecessarily.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 44a64d1c388bfe3c3388b191edfaedecfb7bb831)
This helps distinguish processes in process list in top, perf, etc.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 2493f57ce268d6fe7e4c40a87852c347fd60d29e)
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit f99eb2f56d8ca27110a45ae0e1c4bff40ac7a60e)
This is now done in ctdb_ltdb_store_server(), so this
extra bump can be spared.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit cad3107b12e8392f786f9a758ee38cf3a3d58538)
Wrap all creation of child processes inside ctdb_fork() which is used to track all processes we have spawned.
Capture SIGCHLD to track also which child processes have terminated.
Wrap kill() inside ctdb_kill() and make sure that we never send a !0 signal to a child process pid that has already terminated (and might have been replaced with a
(This used to be ctdb commit f73a4b1495830bcdd094a93732a89dd53b3c2f78)
This can improve performance and stop clients from having to chase a rapidly migrating/bouncing record
(This used to be ctdb commit d0d98f7e45e5084b81335b004d50bddc80cdc219)
This can improve performance slightly on certain workloads where smbds frequently read from the same record
(This used to be ctdb commit 035c0d981bde8c0eee8b3f24ba8e2dc817e5b504)
Everytime we give a delegation to another node we count this as one delegation.
If the same record is delegated to several nodes we count one for each node.
Everytime a record has all its delegations revoked we count this as one revoke.
(This used to be ctdb commit b098bcf8007be63889aaed640a951b0eeaa9d191)
In a few places functions are called, the return code is assigned into
a variable but it is not checked. This generates a compiler warning
like this:
warning: variable ‘ret’ set but not used [-Wunused-but-set-variable]
Instead we remove the warning by checking the return code variable and
log a warning at DEBUG level if the return code indicates an error.
The justification is that there may have been a future intent to check
the return code but it hasn't been important enough to follow-up. If
it matters, it will be logged for easy debugging.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 1932466c76de2b184c2a257120768ab8c9d6c12a)
Fix bug when ctdbd updates the local copy of a delegated record to write the correct
amount of data to the record.
(This used to be ctdb commit 8814d8bc159a5e368afaa236ac7d865165db04b2)
Remove the code in the example client code that writes the record to the local tdb.
Add code to the local ctdbd processing of replies to check if this reply contain a ro delegation and if so, spawn a child process to lock the tdb and then write the data.
(This used to be ctdb commit bf1d429227dc4f5818263cc39401d0a22663cdba)
This implements the ReadOnly and ReadWrite request processing, delegation and revoking of delegations for all requests coming in across the network from a remote node.
(This used to be ctdb commit 78f2c2ea70e6270cec59db7c3f174a511bf608a9)
This triggers a child process to be created to perform the actual potentially blocking calls that are required.
(This used to be ctdb commit 7d575ee92c95bc4aab78a33bc1aac7ff0811ab3a)
Once the contexts are freed, the deferred calls are re-issued to the input packet processing functions again.
This is needed when/if a CALL can not currently be processed by the main engine due to the record being locked down for revoking of all delegations.
The data is passed through several layers of callbacks, and finally a timed event callback to ensure that the processing of the packet will be restarted again at the topmost eventloop, avoinding event loop nesting.
(This used to be ctdb commit cc6f78efcfa3b8caeffbd68018e6dfbf81488dce)
We should release the lock on the record before returning; otherwise the
recovery (which tries to freeze the database) will fail. Symptoms are as
follows:
ctdbd: pnn 15 dmaster request for new-dmaster 19 from non-master 1 real-dmaster=5 key f049c3c8 dbid 0x6cf2837d gen=1148812532 curgen=1148812532 c->rsn=2 header.rsn=15 reqid=2147483585 keyval=0x4f464e49
ctdbd: ctdb_req_dmaster from non-master. Force a recovery.
...
ctdbd: freeze_lock-1:server/ctdb_freeze.c:55 Failed to lock database registry.tdb
CQ:1022545
(This used to be ctdb commit 38b2dbe0605816742e74e2b8a811eaba99c7e12d)
This temporary flag is used for the local record storage function to
decide whether to delete an empty record which has never been migrated
with data as part of the fast-path vacuuming process or, or to store
the record.
(This used to be ctdb commit c11ca778ee90444c44dee0a629cd2eefa3a1f75e)
This way, the MIGRATED_WITH_DATA information can be transported
along with the records. This is important for vacuuming to function
properly.
The record flags are appended to the data section of the ctdb_req_dmaster
and ctdb_reply_dmaster structs.
Pair-Programmed-With: Stefan Metzmacher <metze@samba.org>
(This used to be ctdb commit 945187d64cfc7bd30a0c3b0d548cbe582d95dde3)
If we find a situatior where we get a stray packet with the wrong
dmaster, dont suicide with ctdb_fatal() since this is too disruptive.
Just drop the stray packet and force a recovery to make sure all is good again.
CQ S1022004
(This used to be ctdb commit 62b7fe853db37c0a90e48a0332a3426a8dcb4ed8)
This concept didnt work out and it is really just as expensive as a full migration
anyway, without the benefit of caching the data for subsequence accesses.
Now, migrate the records immediately on first access.
This will be combined with a "cheap vacuum-lite" for special empty records to
prevent growth of databases.
Later extensions to mimic read-only behaviour of records will include proper shared read-only locking of database records, making the laccessor/lacount read-only access to the data obsolete anyway.
By removing this special case and handling of lacount laccessor makes the codapath where shared read-only locking will be be implemented simpler, and frees up space in the ctdb_ltdb header for use by vacuuming flags as well as read-only locking flags.
(This used to be ctdb commit 155dd1f4885fe142c6f8bd09430f65daf8a17e51)
when we migrate a non-empty record onto the node
or a non-empty record off the node
When we migrate a record back to the lmaster and yield the dmaster role,
inspect this flag if if it is still not set, we can delete the record from
the local database as soon as we have migrated it back to the lmaster.
(This used to be ctdb commit a8cc35191df1cd4b866897df71d317ce5f198cb5)
During this window we may still hit asynchronous events that will fail because we can not send/receive packets from other nodes.
These messages are logged as ... Transport is DOWN. To help indicate that they are benign messages related to the process of shutting down.
These messages spam the syslog during normal shutdown, so this patch will drop the loglevel of these messages to DEBUG, so that they will not appear in or spam the syslog.
(This used to be ctdb commit 8275d265d2ae19b765e30ecf18f6b6319b6e6453)
In Samba this is now called "tevent", and while we use the backwards
compatibility wrappers they don't offer EVENT_FD_AUTOCLOSE: that is now
a separate tevent_fd_set_auto_close() function.
This is based on Samba version 7f29f817fa.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 85e5e760cc91eb3157d3a88996ce474491646726)
If a noremote node hangs for an extended period, it is possible
that we might have a DMASTER request in flight for record A to that node.
Eventually we will reuse the idr, and may reuse it for a DMASTER request to a different node for a different record B.
If while the request for B is in flight, the first tnode un-hangs and responds back
we would receive a dmaster reply for the wrong record.
This would cause a record to become perpetually locked, since inside the daemon we would tdb_chainlock(dmaster_reply->pdu->key) but once the migration would complete we would chainunlock idr->state->call->key
Adding code to verify that when we receive a dmaster reply packet that it does in fact match the exact same key that the state variable we have for the idr in flight.
(This used to be ctdb commit 2f6a870d7ff02ceb61fde242f752dccbfcb4cb37)
DEBUG(DEBUG_DEBUG,("pnn %u dmaster response %08x\n", ctdb->pnn, ctdb_has
DEBUG(DEBUG_DEBUG,("pnn %u dmaster request on %08x for %u from %u\n",
(This used to be ctdb commit a3473e7a445b14520a49585c460429dfbfe1fce0)
This gets just too noisy on a busy system.
And it is purley informational anyways...
Michael
(This used to be ctdb commit 7f64a00c76203fdf6673c3f862a4bfd17fb848d7)
With the new vacuuming code, dont treat an invalid dmaster as fatal. Let it update to the new value insetad.
(This used to be ctdb commit 5b70fa8cfd5916d3c212823ad5cc1b251ae175ed)
In ctdb_client.c:ctdb_transaction_commit(), after a failed
TRANS2_COMMIT control call (for instance due to the 1-second
being exceeded waiting for a busy node's reply), there is a
1-second gap between the transaction_cancel() and
replay_transaction() calls in which there is no lock on the
persistent db. And due to the lack of global state
indicating that a transaction is in progress in ctdbd, other nodes
may succeed to start transactions on the db in this gap and
even worse work on top of the possibly already pushed changes.
So the data diverges on the several nodes.
This change fixes this by introducing global state for a transaction
commit being active in the ctdb_db_context struct and in a db_id field
in the client so that a client keeps track of _which_ tdb it as
transaction commit running on. These data are set by ctdb upon
entering the trans2_commit control and they are cleared in the
trans2_error or trans2_finished controls. This makes it impossible
to start a nother transaction or migrate a record to a different
node while a transaction is active on a persistent tdb, including
the retry loop.
This approach is dead lock free and still allows recovery process
to be started in the retry-gap between cancel and replay.
Also note, that this solution does not require any change in the
client side.
This was debugged and developed together with
Stefan Metzmacher <metze@samba.org> - thanks!
Michael
(This used to be ctdb commit f88103516e5ad723062fb95fcb07a128f1069d69)
This can cause a memory leak if the call is terminated before we have managed to respond to the client.
(and the call is talloc_free()d but the data is still hanging off ctdb)
instead we must talloc_steal() the data and hang it off the call structure to avoid the memory leak.
In order to do this we must also change the call structure that is passed into ctdb_call_local() to be allocated through talloc().
This structure was previously either a static variable, or an element of a larger talloc()ed structure (ctdb_call_state or ctdb_client_call_state) so
we must change all creations of a ctdb_call into explicitely creating it through talloc()
(This used to be ctdb commit 4becf32aea088a25686e8bc330eb47d85ae0ef8f)
this allows us to print from which node Invalid or Dropped orphan become
dmaster packets came from
(This used to be ctdb commit 88efd1bf4c796cd2b184156b72296587bc38bb40)