- allow an event script to be specified that will take IPs, release
IPs, and handle recovery in system specific ways
- redirect stderr in subcommands to the log
(This used to be ctdb commit de0fc9ba370db781f9c46406ed180c8211946c7a)
- use -n to specify node number in ctdb utility
- change 'ctdb status' to 'ctdb statistics'
- added 'ctdb status' which shows status
- added netmask to public IPs, so you don't try a takeover on a
foreign network
- cleaned up tools/ctdb_control.c a lot
- generate usage message at runtime
(This used to be ctdb commit 28de71c03ace7d32a9fd9882fabbd5d668b97656)
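The netmask check mentioned in this commit can be pictured as a plain subnet comparison. This is a minimal sketch, assuming IPv4 addresses and a prefix length; `same_network()` is a hypothetical helper written for illustration and is not taken from the ctdb tree.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when both dotted-quad addresses
 * fall inside the same network for the given prefix length (0..32).
 * Returns false on a parse error or an out-of-range prefix. */
static bool same_network(const char *a, const char *b, unsigned prefix_len)
{
    struct in_addr ia, ib;
    uint32_t mask;

    if (prefix_len > 32) {
        return false;
    }
    if (inet_pton(AF_INET, a, &ia) != 1 || inet_pton(AF_INET, b, &ib) != 1) {
        return false;
    }

    /* A shift by 32 is undefined in C, so /0 is handled separately. */
    mask = (prefix_len == 0) ? 0 : (0xFFFFFFFFu << (32 - prefix_len));

    /* Compare the network parts in host byte order. */
    return (ntohl(ia.s_addr) & mask) == (ntohl(ib.s_addr) & mask);
}
```

A node would only attempt a takeover of a public IP when a check of this kind succeeds for one of its own interface addresses.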
IP. A raw tcp ack is sent for each tcp connection held by clients
before the IP takeover.
These acks have a deliberately incorrect sequence number, and should
cause the windows client to send its own ack which will in turn cause
a tcp reset and thus cause windows clients to much more quickly
reconnect to the new node.
(This used to be ctdb commit eef38bfe8461b47489d169c61895d6bb8a8f79a1)
add sending of grat arp: both normal grat arp (request) and also
unsolicited grat arp replies
(This used to be ctdb commit 7305c00c21c30bdbafc3722a018513378bd307e6)
- fixed a bug in traverse
- get a lock on the node list file in the recmaster recovery daemon
(This used to be ctdb commit 162a5647535ad1cb3e8e5d4042a2784365fb1913)
this leaves only one single function where a node is marked as dead
instead of two places
(This used to be ctdb commit aa764ea26cc26d5c1ae188105236da603576f45b)
keepalive traffic for x seconds it is deemed dead
this triggers a recovery after a while if a ctdbd has been STOPPED
but it doesn't recover automatically when the node reappears
(This used to be ctdb commit d6324afe0d13b5e21d06e347caca433c6b36a32a)
sense to have the daemon requeue the packets if they timeout or fail to
deliver to the remote node
(This used to be ctdb commit 9fb753046787190970654aeb937e96685ac53184)
use this control from the recovery daemon to ensure that the recmaster
always has a higher rsn than any other node for the records after
recovery completes
(This used to be ctdb commit 6fb6a8b981a804bfcc460c4481c51c7c647230f6)
- fixed the re-send of ctdb calls after a generation change
- fixed a reqid idr leak in controls
- removed the write_record test code
- use the new nonblock lockall code to prevent ctdbd from ever doing a
blocking lock that could deadlock with smbd
- moved more of the recovery controls into ctdb_recover.c
(This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec)
update the recovery test script to start all ctdb daemons with a
recovery daemon
(This used to be ctdb commit 47794e16df285cacefc30208d892d931a6e46b96)
election is primitive, it elects the lowest vnn as the recovery master
two new controls, to get/set recovery master for a node
to use recovery daemon, start one
./bin/recoverd --socket=ctdb.socket*
for each ctdb daemon
it has been briefly tested by deleting and adding nodes to a 4 node
cluster but needs more testing
(This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3)
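The "lowest vnn wins" election can be sketched in a few lines. The struct and function names here are hypothetical, and the restriction to connected nodes is an assumption; the commit message only says that the lowest vnn is elected.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-node state, for illustration only. */
struct node_state {
    uint32_t vnn;
    bool connected;
};

/* Returns the lowest vnn among connected nodes,
 * or UINT32_MAX when no node is connected. */
static uint32_t elect_recmaster(const struct node_state *nodes, size_t n)
{
    uint32_t best = UINT32_MAX;

    for (size_t i = 0; i < n; i++) {
        if (nodes[i].connected && nodes[i].vnn < best) {
            best = nodes[i].vnn;
        }
    }
    return best;
}
```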
one broadcast address for all nodes
and one broadcast address for all nodes in the current vnnmap
update all usage of the old flag to now only broadcast to the vnnmap
except for tools/ctdb_control where it makes more sense to broadcast to
all nodes
(This used to be ctdb commit dfb65b88cf67ad9d61268c4b47a6d8ae346f47df)
this program is a client to the local ctdb daemon
every second it pulls all vnnmap and nodemaps from all nodes that are
available and checks if a recovery is required
a recovery is required if :
* all nodes do NOT have an identical vnnmap and generation
* all nodes do NOT have an identical nodemap
* there are active nodes that are NOT in the nodemap
* there are nodes in the nodemap that are NOT active
During recovery, the recovery tool will also make sure that all nodes
know about and have created all databases.
(This used to be ctdb commit 2f2650467bac7e8954de7c17cb34f46b0bdbcd26)
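The four recovery conditions listed above can be sketched as one predicate over the per-node answers collected in a polling round. Everything here is an assumption made for illustration: the `node_report` layout, the fixed `SKETCH_MAX_NODES` bound, and the idea that the first active node serves as the reference for comparison.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_NODES 8  /* arbitrary bound for this sketch */

/* Hypothetical summary of what one node reported in a polling round. */
struct node_report {
    bool active;          /* node answered this round */
    bool in_nodemap;      /* node is listed in the nodemap */
    uint32_t generation;
    uint32_t vnnmap_size;
    uint32_t vnnmap[SKETCH_MAX_NODES];
    uint32_t nodemap_size;
    uint32_t nodemap_flags[SKETCH_MAX_NODES];
};

static bool same_array(const uint32_t *a, uint32_t na,
                       const uint32_t *b, uint32_t nb)
{
    if (na != nb) {
        return false;
    }
    for (uint32_t i = 0; i < na; i++) {
        if (a[i] != b[i]) {
            return false;
        }
    }
    return true;
}

/* Returns true when any of the listed conditions holds. */
static bool recovery_required(const struct node_report *r, size_t n)
{
    const struct node_report *ref = NULL;

    for (size_t i = 0; i < n; i++) {
        /* active but not in the nodemap, or in the nodemap but not active */
        if (r[i].active != r[i].in_nodemap) {
            return true;
        }
        if (!r[i].active) {
            continue;
        }
        if (ref == NULL) {
            ref = &r[i];  /* first active node is the reference */
            continue;
        }
        if (r[i].generation != ref->generation) {
            return true;
        }
        if (!same_array(r[i].vnnmap, r[i].vnnmap_size,
                        ref->vnnmap, ref->vnnmap_size)) {
            return true;
        }
        if (!same_array(r[i].nodemap_flags, r[i].nodemap_size,
                        ref->nodemap_flags, ref->nodemap_size)) {
            return true;
        }
    }
    return false;
}
```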
- allow controls to know which client invoked them
- added a client_id to clients, so they can be identified remotely
- added the ability to remove registered srvids
- in the list_keys code, register a temp srvid, then remove it afterwards
(This used to be ctdb commit 29603c51cc6d81362532cd8e50f75c8360c5f5ef)
don't explicitly free the vnnmap pointer in the getvnnmap control; this
is freed by the mem_ctx instead
add code to the recoverd to detect when/if recovery is required
verify that the number of active nodes, the nodemap and the vnn map are
consistent across the entire cluster and if not trigger a recovery
(which right now just prints "we need to do recovery" to the screen).
(This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5)
change ctdb_control so it takes a timeval pointer as argument.
this is the timeout. if the node has not responded within the timeout
ctdb_control will return an error instead of hanging.
if the timeval pointer is NULL then the call will block indefinitely if
there is no response.
this is used for now in the createdb control but all the helpers
ctdb_ctrl_* should probably be updated to take a timeout parameter as
well.
(This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211)
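The timeout convention described here (a `struct timeval` pointer, with NULL meaning "block indefinitely") matches how `select()` treats its timeout argument. The sketch below uses `select()` as a stand-in to show the semantics; `wait_for_reply()` is a hypothetical helper, and ctdb_control itself is not claimed to be implemented this way.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: waits until fd is readable.
 * timeout == NULL blocks indefinitely; otherwise gives up after *timeout.
 * Returns 0 when a reply is readable, -1 on timeout or error. */
static int wait_for_reply(int fd, struct timeval *timeout)
{
    fd_set rfds;
    struct timeval tv;
    int ret;

    do {
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        /* select() may modify its timeout, so work on a copy.
         * After EINTR this simply restarts with the full timeout. */
        if (timeout != NULL) {
            tv = *timeout;
        }
        ret = select(fd + 1, &rfds, NULL, NULL, timeout ? &tv : NULL);
    } while (ret == -1 && errno == EINTR);

    return (ret == 1) ? 0 : -1;
}
```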
for the time being
remove all the [de]marshalling and just pass a structure around instead
(This used to be ctdb commit b1169555ab7015976c0135ff51121cc238f5887c)
signature (flags field)
update some calls to ctdb_get_config() to use the new name
ctdb_ctrl_get_config()
change #include "talloc/talloc.h" to #include "lib/talloc/talloc.h" in
lib/events/events.h
(This used to be ctdb commit d2cdd87037b9f0c387228d7d4743da4869929c93)
this is not optimized at all and copies/merges all records between
databases instead of only those records for which a certain node is
lmaster. (step 7 should later be enhanced to: a) delete the database,
b) push only those records for which the node is lmaster)
(This used to be ctdb commit 509d2c71169e96a8610f9db91293dc7a73c2cc10)
attach to databases after the protocol has started. The daemon
broadcasts information on new databases to the other daemons.
This also eliminates the need for the client to know about the hash
between db name and db_id.
(This used to be ctdb commit 3bad91a9d987d4c09fe3322eac23c2733660ad08)
for 2 days.
The main bug was in smbd, but there was a secondary (and more subtle)
bug in ctdb that the bug in smbd exposed. When we get sent a dmaster
reply, we have to correctly update the dmaster in the recipient even
if the original request has timed out, otherwise ctdbd can get into a
loop fighting over who will handle a key.
This patch also cleans up the packet allocation, and makes ctdbd
become a real daemon.
(This used to be ctdb commit 59405e59ef522b97d8e20e4b14310a217141ac7c)
while recovery is in progress the daemon will discard all CTDB_REQ_CALL
and rely on clients retransmitting them
add new controls to get/set the recovery mode
(This used to be ctdb commit 41458a61577885ac49150f830e92e93e634c5411)
it does not yet work since ctdb_control can right now only be called
from client context and the pull is implemented as the target ctdb node
itself using a get_keys to pull the keys from the source node thus
the ctdb daemon needs to send a ctdb_control to a remote node
(This used to be ctdb commit a55c7c64b4ff87f54b90649c9f469b1ff36dc9ea)
this will allow a node to verify that a received pdu is sent from a node
in the same generation instance of a cluster.
(This used to be ctdb commit e32d3ca9a622237c4e2622de98825c0962760d48)
broadcasted to all daemons in the cluster
change the message dispatch routine for sending messages so that it
allows several clients to use the same srvid
messages are then passed on to all clients that have that srvid
(This used to be ctdb commit 05d7ebb3556785f0f17a87d808f31ffe8dac288a)
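Delivering one message to every client registered for a srvid can be sketched as a scan over a registration table. The table layout, the 64-bit srvid width, and the handler signature are assumptions for illustration, not the daemon's actual data structures.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*msg_handler_fn)(uint64_t srvid, const char *data,
                               void *private_data);

/* Hypothetical registration entry: several entries may share a srvid. */
struct srvid_registration {
    uint64_t srvid;
    msg_handler_fn handler;
    void *private_data;
};

/* Calls every handler registered for srvid and
 * returns how many clients received the message. */
static size_t dispatch_message(const struct srvid_registration *regs,
                               size_t nregs, uint64_t srvid,
                               const char *data)
{
    size_t delivered = 0;

    for (size_t i = 0; i < nregs; i++) {
        if (regs[i].srvid == srvid) {
            regs[i].handler(srvid, data, regs[i].private_data);
            delivered++;
        }
    }
    return delivered;
}

/* Example handler: counts deliveries in the int private_data points at. */
static void count_handler(uint64_t srvid, const char *data,
                          void *private_data)
{
    (void)srvid;
    (void)data;
    (*(int *)private_data)++;
}
```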
update ctdb_lmaster() to return the lmaster based on this table's contents
initialize the vnn table based on number of nodes for now.
later when recovery is implemented the recovery process will populate
this mapping table.
(This used to be ctdb commit 71e440f6c26ea074f9887237c962101c8cef8c80)
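A table-based lmaster lookup of the kind described here can be sketched as "hash the key, index the vnn table". The struct layout and function names are hypothetical, and FNV-1a is used only as a stand-in hash; the hash ctdb actually uses is not stated in this commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping table: slot index -> vnn of the lmaster. */
struct vnn_map {
    uint32_t generation;
    uint32_t size;   /* number of slots, must be > 0 */
    uint32_t *map;
};

/* Stand-in hash (32-bit FNV-1a), chosen only for this sketch. */
static uint32_t key_hash(const uint8_t *key, size_t len)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++) {
        h ^= key[i];
        h *= 16777619u;
    }
    return h;
}

/* The lmaster for a key is whatever vnn sits in the slot the key hashes to. */
static uint32_t lmaster_for_key(const struct vnn_map *m,
                                const uint8_t *key, size_t len)
{
    return m->map[key_hash(key, len) % m->size];
}
```

With this shape, a recovery process can change which node is lmaster for a slot by rewriting the table, without touching the hash.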
store the idr as the high 16 bits and use a rotating counter for the low
16 bits.
(This used to be ctdb commit 7c763b7b5e6ca54a6df4586893ddaf1b508b4c22)
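The reqid layout described here packs two 16-bit fields into one 32-bit value. The helper names below are hypothetical; the likely benefit, stated as an assumption, is that a stale reply aimed at a reused idr slot carries a different low half and can be rejected.

```c
#include <stdint.h>

/* idr id in the high 16 bits, rotating counter in the low 16 bits. */
static uint32_t make_reqid(uint16_t idr_id, uint16_t counter)
{
    return ((uint32_t)idr_id << 16) | counter;
}

static uint16_t reqid_idr(uint32_t reqid)
{
    return (uint16_t)(reqid >> 16);
}

static uint16_t reqid_counter(uint32_t reqid)
{
    return (uint16_t)(reqid & 0xFFFFu);
}
```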
dmaster request stage, and instead directly send a dmaster
reply. This avoids a race condition where a new call comes in for
the same record while processing the dmaster request
- don't keep any redirect records during a ctdb call. This prevents a
memory leak in case of a redirect storm
(This used to be ctdb commit 59889ca0fd606c7d2156839383a09dfc5a2e4853)
- make ctdb capable of alternative connection (like ib) again, solved the fork problem
- do_debug memory overwrite bugfix (occurred using ibwrapper_test with wrong address given)
(This used to be ctdb commit da0b84cda26d544f63841dfd770ed7ebad401944)
ctdb_ltdb_lock_requeue() and a small wrapper
- use ctdb_ltdb_lock_requeue() to fix a possible hang in
ctdb_reply_dmaster(), where the ctdb_ltdb_store() could hang waiting
for a client. We now requeue the reply_dmaster packet until we have
the lock
(This used to be ctdb commit 97cd7aa09ce3abbb5e3e965c5c81668e0c0133a5)
CTDB_FLAG_TORTURE, which forces some race conditions to be much more
likely. For example a 20% chance of not getting the lock on the
first try in the daemon
- abstracted the ctdb_ltdb_lock_fetch_requeue() code to allow it to
work with both inter-node packets and client->daemon packets
- fixed a bug left over in ctdb_call from when the client updated the
header on a call reply
- removed CTDB_FLAG_CONNECT_WAIT flag (not needed any more)
(This used to be ctdb commit 7559dcd184666c3853127e3c8f5baef4fea327c4)
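The "20% chance of not getting the lock" behaviour can be sketched as a random failure injected only when the torture flag is set. The flag value and the function name are hypothetical, and `rand()` is used purely for illustration.

```c
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_FLAG_TORTURE 0x1u  /* hypothetical bit value */

/* Returns true when the caller should pretend the first lock attempt
 * failed. Without the torture flag this never happens; with it,
 * roughly one attempt in five is failed on purpose. */
static bool torture_fail_lock(unsigned flags)
{
    if (!(flags & SKETCH_FLAG_TORTURE)) {
        return false;
    }
    return (rand() % 5) == 0;
}
```

Forcing the slow path this often makes the requeue logic run on every test pass instead of only under rare contention.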
version. The client version is different enough that this is
worthwhile
- enable local shortcut for client version of ctdb_call
- add idr_find_type(), with better error reporting in case of type
mismatch
(This used to be ctdb commit 2388094c5f7b6ce003e86b05620c06217d63b49c)
us to put memory directly in the right context, avoiding quite a few
talloc_steal calls, and simplifying the code
- make the fetch lock code in the daemon fully async
(This used to be ctdb commit d98b4b4fcadad614861c0d44a3854d97b01d0f74)
a client held the chainlock, and the daemon received a dmaster reply
at the same time. The daemon would not be able to process the dmaster
reply, due to the lock, but the fetch lock cannot make progress until
the dmaster reply is processed.
The solution is to not hold the lock in the client while talking to
the daemon. The client has to retry the lock after the record has
migrated. This means that forward progress is not guaranteed. We'll
have to see if that matters in practice.
(This used to be ctdb commit 737e5a1253cb048222c595a474aff71c99fc554f)
occasionally leads to problems if an immediate send on the socket
causes a context switch and the client exiting before the daemon. We
now exit the client when the daemon goes away.
(This used to be ctdb commit b7bed0088e700f25105ceea63640b38804f51e4d)
- fixed memory leaks in the 3 packet receive routines. The problem was
that the ctdb_call logic would occasionally complete and free an
incoming packet, which would then be freed again in the packet
receive routine. The solution is to make the packet a child of a
temporary context in the receive routine then free that temporary
context. That allows other routines to keep or free the packet if
they want to, while allowing us to safely free it (via a free of the
temporary context) in the receive function
(This used to be ctdb commit 304aaaa7235febbe97ff9ecb43875b7265ac48cd)
fetch, to avoid the daemon re-reading it
- suffix the database name with the node name so that testing on
loopback doesn't result in a name collision in the database open
(This used to be ctdb commit ad30a4db75450643ff146c40faa306a021de3dd2)
code. It may be added back later once everything is working nicely,
or simulated using an in-process pipe instead of a unix domain socket
- rewrote the ctdb_fetch_lock() code to follow the new design
(This used to be ctdb commit 5024dd1f305fe1ecc262db2240c56f773b4f28f0)
change ctdb_client_fetch_lock to return a status code instead of a record handle and make it unconditionally fill in data.
change ctdb_client_store_unlock to take ctdb_db and key as arguments instead of a record handle
update the ctdb_fetch.c test to use the clientside helpers for fetching and storing data
(This used to be ctdb commit 22d5d40375e0135916c97945646f94119612615d)
this will be the core of the non-blocking lock idea for ctdb, it will be used
in place of ctdb_ltdb_fetch(), but will also get a lock. It re-starts a request
if it needs to block
(This used to be ctdb commit afa479026cf6293e6a878c8a329cdac035284672)
The problem we have is this:
- we want the client smbd processes to be able to 'shortcut' access
to the ltdb, by directly accessing the ltdb, and if the header of
the record shows we are the dmaster then process immediately, with
no overhead of talking across the unix domain socket
- a client doing a shortcut will use tdb_chainlock() to lock the
record while processing
- we want the main ctdb daemon to be able to set locks on the
record, and when those locks collide with a 'shortcut' fcntl lock,
we want the ctdb daemon to keep processing other operations
- we don't want to have to send a message from a smbd client to the
ctdbd each time it releases a lock
The solution is shown in this example. Note that the expensive fork()
and blocking lock is only paid in case of contention, so in the median
case I think this is zero cost.
(This used to be ctdb commit a3248c3e2b740cd2403acffd3c1f6a33dca0ea03)
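The example this commit refers to is not reproduced on this page. The idea can be sketched as follows, under stated assumptions: try a non-blocking fcntl lock first, and only on contention fork a child that takes the blocking lock and signals the parent over a pipe. `lock_or_wait()` and `set_lock()` are hypothetical names, and the child exiting right after the notification is a simplification.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Apply a one-byte write lock at offset using F_SETLK or F_SETLKW. */
static int set_lock(int fd, off_t offset, int cmd)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = offset;
    fl.l_len = 1;
    return fcntl(fd, cmd, &fl);
}

/* Returns 0 when the lock was taken immediately (no fork, no cost).
 * Returns 1 on contention: a child now blocks on the lock and *wait_fd
 * becomes readable once the child has obtained it.
 * Returns -1 on error. */
static int lock_or_wait(int fd, off_t offset, int *wait_fd, pid_t *child)
{
    int p[2];

    if (set_lock(fd, offset, F_SETLK) == 0) {
        return 0;
    }
    if (errno != EACCES && errno != EAGAIN) {
        return -1;
    }
    if (pipe(p) != 0) {
        return -1;
    }
    *child = fork();
    if (*child == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (*child == 0) {
        char c = 0;

        close(p[0]);
        /* Blocks here until the current holder releases the lock. */
        if (set_lock(fd, offset, F_SETLKW) != 0 || write(p[1], &c, 1) != 1) {
            _exit(1);
        }
        _exit(0);  /* exiting drops the child's lock again */
    }
    close(p[1]);
    *wait_fd = p[0];
    return 1;
}
```

On a return of 1 the caller would add `*wait_fd` to its event loop, keep serving other requests, and retry the non-blocking lock (after reaping the child with `waitpid()`) once the descriptor becomes readable.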
note that the store_unlock does not actually do anything yet apart from passing the pdu from client to daemon and the daemon responding.
next is to make sure the daemon actually stores the data in a database
(This used to be ctdb commit 167d6993e78f6a1d0f6607ef66925a14993ae6a1)
no locking is yet done and the store_unlock call is still missing
the ./tests/fetch.sh --daemon test fails with parent process dying which needs to be investigated.
(This used to be ctdb commit 7d7141c968950a8856f1be79871932b688bfb07f)
- split client specific routines out of ctdb_daemon.c
- use ctdb_queue code in message send from client to daemon
- use clearer names in client/daemon functions
- use talloc autofree context to avoid global for unlink of socket on
exit
- start on API change for message handler, to allow ctdb messaging to
handle daemon mode with multiple clients
(This used to be ctdb commit 53555db45f3583ae4a32cc3aa9e07fb8ef2a77e3)
we can no longer use this function from the application if we are in daemon mode.
add a horrible "sleep()" to ctdb_test.c to prevent the daemon from disappearing (parent process died) when the application exits, which may happen before the other nodes in the test have finished talking to our daemon
(This used to be ctdb commit 74d35dafe06d71e755f3a58cc58d4b9b56fc821b)
send the correct structure back to a client
assorted other cleanups
(tests/test1.sh now works in daemon mode)
(This used to be ctdb commit f4593754cab750dfdb9384884502e2e1b8fde1f0)
added transport level packet allocator, allowing the transport to
enforce alignment or special memory rules
(This used to be ctdb commit 50304a5c4d8d640732678eeed793857334ca5ec1)
ctdb will now move the dmaster role between nodes after
CTDB_MAX_LACOUNT consecutive accesses by the same node.
(This used to be ctdb commit af87f587d8f70192ecac0125054bf9583a4849a7)
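The migration rule can be sketched as a per-record counter of consecutive accesses by one node. The header layout, the function name, and the threshold value of 7 are assumptions for this sketch; the commit message names CTDB_MAX_LACOUNT but does not give its value.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_LACOUNT 7  /* assumed value, for illustration only */

/* Hypothetical record header fields. */
struct record_header {
    uint32_t dmaster;    /* node currently holding the record */
    uint32_t laccessor;  /* node that made the last access */
    uint32_t lacount;    /* consecutive accesses by laccessor */
};

/* Records one access by caller. Returns true when the dmaster role
 * was moved to caller because it reached the threshold. */
static bool record_access(struct record_header *h, uint32_t caller)
{
    if (h->laccessor != caller) {
        h->lacount = 0;  /* the streak is broken by a different node */
    }
    h->laccessor = caller;
    h->lacount++;

    if (caller != h->dmaster && h->lacount >= SKETCH_MAX_LACOUNT) {
        h->dmaster = caller;
        h->lacount = 0;
        return true;
    }
    return false;
}
```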
- added --self-connect option to ctdb_test, allowing testing when a
node connects to itself. not as efficient as local bypass, but very
useful for testing purposes (easier to work with 1 task in gdb than
2)
- split the ctdb_call() into an async triple, in the style of Samba4
async functions. So we now have ctdb_call_send(), ctdb_call_recv()
and ctdb_call().
- added the main ctdb_call protocol logic. No error checking yet, but
seems to work for simple cases
- ensure we initialise the length argument to getsockopt()
(This used to be ctdb commit 95fad717ef5ab93be3603aa11d2878876fe868d3)
- split up ctdb layer code into 3 modules
- added a simple test suite
- added packet structures for ctdb_call
- switched to an array for ctdb_node to make vnn lookup easy and fast
(This used to be ctdb commit 8a17460a816a5970f2df8244a06aec55d814f186)
- added basic IO handling for the tcp backend
- added a ctdb_node_dead upcall
- added packet queueing
- added incoming packet handling
(This used to be ctdb commit 415497c952630e746e8cdcf8e1e2a7b2ac3e51fb)