Add a ctdb command to pull the talloc memory map from the recovery daemon:
ctdb rddumpmemory
(This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05)
The controls only modify the runtime setting of which public addresses a node
can serve, and do not modify /etc/ctdb/public_addresses.
To make the change permanent you also need to edit /etc/ctdb/public_addresses
manually.
After ip addresses have been added or deleted, you need to invoke a recovery
for the addresses to be redistributed.
(This used to be ctdb commit f8294d103fdd8a720d0b0c337d3973c7fdf76b5c)
Add back the controls to enable/disable monitoring that we used to have for debugging but removed a while ago.
(This used to be ctdb commit 8477f6a079e2beb8c09c19702733c4e17f5032fe)
Vacuuming used to delete one record at a time on all nodes; that was m*n
behaviour, required a huge storm of ctdb->ctdb controls, and simply would not scale.
The new vacuuming process collects all records to be deleted locally and then sends only one control to the other nodes. This control contains a list of all the records to be deleted.
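A minimal sketch of the batching idea, with an invented payload layout (the real control format lives in the ctdb sources): pack every doomed record key into a single buffer, then send that buffer once per node.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Invented payload layout: [count][len][key][len][key]... */
    static size_t pack_key(uint8_t *buf, size_t off,
                           const char *key, uint32_t len)
    {
        memcpy(buf + off, &len, sizeof(len));
        memcpy(buf + off + sizeof(len), key, len);
        return off + sizeof(len) + len;
    }

    int main(void)
    {
        uint8_t payload[4096];
        uint32_t count = 0;
        size_t off = sizeof(count);            /* reserve room for count */
        const char *doomed[] = { "rec1", "rec2", "rec3" };

        /* Step 1: collect ALL records to delete into one buffer ... */
        for (int i = 0; i < 3; i++) {
            off = pack_key(payload, off, doomed[i],
                           (uint32_t)strlen(doomed[i]));
            count++;
        }
        memcpy(payload, &count, sizeof(count));

        /* Step 2: ... then send ONE control per node carrying the list,
         * instead of one control per record per node (the old m*n). */
        printf("one control, %u records, %zu bytes\n", count, off);
        return 0;
    }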
(This used to be ctdb commit 9e625ece19a91f362c9539fa73b6b2108f0d9c53)
ctdb moveip <IPADDRESS> <NODE>
which can be used to manually fail an ip address over to a specific node.
This can only be used if DeterministicIPs is disabled and NoIPFailback is enabled.
(This used to be ctdb commit ffee062b7e26a6aa6ad254edb58399040ecaa542)
Add a new control that causes the node to drop the current nodes list
and reread it from the nodes file.
During this operation, the node will also drop the tcp layer and restart it.
The tcp layer is dropped by talloc_free()ing the ctcp structure, so add a
destructor to ctcp so that we can also clean up and remove the references in
the ctdb structure to the transport layer, as sketched below.
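A minimal sketch of the destructor pattern, with invented struct and function names (talloc_zero() and talloc_set_destructor() are the real talloc API):

    #include <talloc.h>

    struct transport {
        struct transport **owner_slot;   /* e.g. &ctdb->transport (illustrative) */
    };

    /* Runs automatically when the transport is talloc_free()d:
     * clear the reference the owning structure still holds to us. */
    static int transport_destructor(struct transport *t)
    {
        if (t->owner_slot != NULL) {
            *t->owner_slot = NULL;
        }
        return 0;   /* 0 lets the free proceed */
    }

    static struct transport *transport_init(TALLOC_CTX *mem_ctx,
                                            struct transport **owner_slot)
    {
        struct transport *t = talloc_zero(mem_ctx, struct transport);
        if (t == NULL) {
            return NULL;
        }
        t->owner_slot = owner_slot;
        *owner_slot = t;
        talloc_set_destructor(t, transport_destructor);
        return t;
    }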
Add two new commands to the ctdb tool: one to list all nodes in the nodes file,
and one to trigger a node to drop the transport and reinitialize it with the new nodes file.
(This used to be ctdb commit 4bc20ac73e9fa94ffd43cccb6eeb438eeff9963c)
Dump the memory tree to stdout. This is much more useful than putting it in the log, and also fixes
a bug where the pipe would overflow internally and cause ctdbd to lock up.
(This used to be ctdb commit e236979e2162d9bd7a495086342168a696cf76c5)
ctdb vacuum : vacuums all the databases, deleting any zero-length
ctdb records
ctdb repack : repacks all the databases, resulting in a perfectly
packed database with no freelist entries
(This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4)
ctdb_recoverd.c
Always handle banning/unbanning locally on the node that is being
banned/unbanned instead of on the recovery master.
This means that if a ban request comes in to the recovery master for a
remote node, we pass the request on to the remote node instead of
setting up the ban and ban timeouts locally.
ctdb.c
Send ban/unban requests to the node being banned/unbanned instead of to
the recmaster.
(This used to be ctdb commit 880dd9f5fd0b91e450da93e195cc5c62cb1dcd6e)
monitoring should always be enabled
(though a node may want to temporarily disable running the "monitor"
event scripts but can do so internally without the need for this
control)
(This used to be ctdb commit e3a33618026823e6af845fd8513cddb08e6b5584)
Add a command to check whether the local node is the recmaster or not.
It returns 0 if the node is the recmaster and 1 (true) if it is not or if
we could not communicate with the ctdb daemon.
Call it 'isnotrecmaster' to cope with the fact that if the tool could not bind to
the socket to talk to the daemon, it will automatically return an
error and exit code 1;
thus the tool will only return 0 if it could talk successfully to the
local daemon and the local daemon confirms this node is the recmaster.
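The naming inversion makes every failure mode collapse onto the same nonzero exit code; a small illustrative sketch of the decision table (not the tool's actual source):

    #include <stdbool.h>

    /* 0 only when we reached the local daemon AND it confirmed that
     * this node is the recmaster; everything else is nonzero. */
    static int isnotrecmaster_exit(bool daemon_reachable, bool is_recmaster)
    {
        if (!daemon_reachable) {
            return 1;   /* could not bind/talk to the daemon */
        }
        return is_recmaster ? 0 : 1;
    }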
(This used to be ctdb commit ae5fcb790b6c3985f514fa8a96bc00c2619f2a28)
and merge them into one big list, since with the deassociation between a node
and a public ip address the /etc/ctdb/public_addresses files can
differ between nodes and no single node knows about all the public addresses that a
cluster can use.
(This used to be ctdb commit e208294fed183977cacc44b2cd1195c11d967c18)
This command no longer makes sense when there is no one-to-one mapping
between a node and its default public ip.
(This used to be ctdb commit 91280db7f6dd3d659edd86fae21ba347d6f9da9e)
Support multiple public addresses spread across multiple interfaces on each
node.
This is a massive patch, since we have previously made the assumption that
we only have one public address per node.
Get rid of the public_interface argument; the public addresses file
now explicitly lists which interface each address belongs to.
(This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8)
Add controls to register/unregister/check a server id.
A server id consists of TYPE:VNN:ID, where TYPE is specific to the
application, VNN is the node where the server id was registered, and ID
might be a node-unique identifier such as a pid or similar.
Clients can register a server id for themselves at the local ctdb daemon.
When a client disappears, or when the domain socket connection for the
client drops, any and all server ids registered across that domain
socket will also be automatically removed from the store.
Clients can register as many server ids as they want at the same time,
but each TYPE:VNN:ID must be globally unique.
Clients also have the option of explicitly unregistering a server id by using
the UNREGISTER control.
Registration and unregistration can only be done by clients against the local
daemon; clients can not register their server id with a remote node.
Clients can check whether a server id exists on any ctdb node in the
network by using the check control.
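The TYPE:VNN:ID triple maps naturally onto a small struct; a sketch that follows the description above (field names and widths are illustrative, not the exact ctdb definition):

    #include <stdint.h>

    /* A server id as described above: the TYPE:VNN:ID triple must be
     * globally unique across the cluster. */
    struct example_server_id {
        uint32_t type;   /* application-specific namespace */
        uint32_t vnn;    /* node the id was registered on */
        uint64_t id;     /* node-unique value, e.g. a pid */
    };

Because registration is tied to the client's domain socket, the daemon can drop every id registered over a connection in one pass when that socket closes.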
(This used to be ctdb commit d44798feec26147c5cc05922cb2186f0ef0307be)
Add a callback function which is called upon completion (or timeout) of the
control.
Modify the scanning of the recmaster in the monitoring_cluster code to try the
api out.
(This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c)
Keep enough information in the control state struct so that if we time out a control we can print debug info such as
which opcode failed and to which node it was sent.
We don't need the *status parameter to ctdb_client_control_state.
Create async versions of the getrecmaster control.
Pass a memory context to getrecmaster.
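Taken together, the async pattern splits each control into a _send that fires the request and a _recv that collects the result; a stubbed, self-contained sketch of that shape (names and the stand-in opcode are invented; the real prototypes live in the ctdb client code):

    #include <stdint.h>
    #include <stdio.h>

    struct control_state {
        uint32_t opcode;     /* kept so a timeout can be reported ... */
        uint32_t destnode;   /* ... including which node failed */
        uint32_t recmaster;
    };

    /* _send fires the control and returns immediately; the caller can
     * have many controls in flight at once. */
    static struct control_state *getrecmaster_send(struct control_state *s,
                                                   uint32_t destnode)
    {
        s->opcode = 16;        /* stand-in opcode value */
        s->destnode = destnode;
        return s;
    }

    /* _recv collects the result once the event loop has completed it. */
    static int getrecmaster_recv(struct control_state *s, uint32_t *out)
    {
        s->recmaster = 0;      /* pretend node 0 won the election */
        *out = s->recmaster;
        return 0;
    }

    int main(void)
    {
        struct control_state st;
        uint32_t recmaster;

        getrecmaster_send(&st, 0);   /* other work can happen here */
        getrecmaster_recv(&st, &recmaster);
        printf("recmaster is node %u\n", recmaster);
        return 0;
    }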
(This used to be ctdb commit 558b680c82f830fba82c283c78c2de8a0b150b75)
places.
Create a new helper function to generate new generation id values that
knows about the invalid id and avoids generating it.
Update the ctdb status tool to know about the invalid generation id and
print the string INVALID instead.
(This used to be ctdb commit 4fbcd189543cb8a92227fdcd3d158472e558ccda)
There is an array for each node/public address that contains the tcp tickles.
We send a TCP_ADD as a broadcast to all nodes when a client is added.
If tcp tickles are removed, they are only removed immediately from the
local node.
Once every 20 seconds a node will push/broadcast out the tickle list for
all public addresses it manages; this will remove any deleted tickles
from the remote nodes.
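Deletions propagate because the periodic push replaces the remote copy wholesale rather than merging into it; a toy receiver-side sketch of that rule (all names invented):

    #include <stdint.h>
    #include <string.h>

    #define MAX_TICKLES 64

    struct tickle_list {
        uint32_t num;
        uint32_t conn[MAX_TICKLES];   /* stand-in for a tcp 4-tuple */
    };

    /* Receiver side of the 20-second broadcast: replace, don't merge,
     * so tickles deleted on the owning node disappear here too. */
    static void store_tickle_update(struct tickle_list *local,
                                    const struct tickle_list *pushed)
    {
        memcpy(local, pushed, sizeof(*local));
    }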
(This used to be ctdb commit e3c432a915222e1392d91835bc7a73a96ab61ac9)
Let the caller create the sending socket, and use a single socket instead
of a new one for each tickle; pass the sending socket to ctdb_sys_send_tcp().
ctdb_sys_kill_tcp is no longer used, so remove it.
Set the socket flags for close-on-exec and nonblocking in the helper that
creates the sockets, instead of in the caller.
Add a helper to create a sending socket to send tickles from.
(This used to be ctdb commit 469f3fb238a0674a2b48fdf1a7e657e32428178a)
the tcp connection.
Change the 60.nfs script to run ctdb killtcp in the foreground, so we
don't get lots of these running in parallel when there are a lot of tcp
connections to RST.
(This used to be ctdb commit d81616214752882242f2886e94681972a790db80)
- split out the event script code into a separate module
- get rid of the separate takeover directory
(This used to be ctdb commit 8ea2c923a3e2464200ff79bf2c3f1f89e6a93ad4)
Checked in so it is not lost:
this tool takes a socketpair as arguments and will reset the tcp
connection.
(This used to be ctdb commit bddd448740ef7f5a88b8549a3d184a94ac9fcd96)
to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes).
(This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c)
not the ctdb sysconfig file, since this variable has nothing at all to do
with ctdb.
(This used to be ctdb commit d17073b7da5ecba1b93a5ed4fbdf86bf052fdc90)
existing /etc/ctdb/released_ips
- only call the statd-callout script if the ips have changed, and call
  it with a "notify" argument; we need to restart the nfslock service in
  both cases
- change statd-callout to explicitly restart the lock manager and statd
  when "notify" is called; copy the state directory for each held ip
  from shared storage to /tmp, then use sm-notify to send notifications to
  all monitored clients
(This used to be ctdb commit 800f15a27af885a3f83430d3bc411cc72ac40e86)
- use -n to specify node number in ctdb utility
- change 'ctdb status' to 'ctdb statistics'
- added 'ctdb status' which shows status
- added netmask to public IPs, so you don't try a takeover on a
foreign network
- cleaned up tools/ctdb_control.c a lot
- generate usage message at runtime
(This used to be ctdb commit 28de71c03ace7d32a9fd9882fabbd5d668b97656)
Add sending of gratuitous arps: both normal gratuitous arp (request) and also
unsolicited gratuitous arp replies.
(This used to be ctdb commit 7305c00c21c30bdbafc3722a018513378bd307e6)
If no keepalive traffic has been seen from a node for x seconds, it is deemed dead.
This triggers a recovery after a while if a ctdbd has been STOPPED,
but it doesn't recover automatically when the node reappears.
(This used to be ctdb commit d6324afe0d13b5e21d06e347caca433c6b36a32a)
- fixed the re-send of ctdb calls after a generation change
- fixed a reqid idr leak in controls
- removed the write_record test code
- use the new nonblock lockall code to prevent ctdbd from ever doing a
blocking lock that could deadlock with smbd
- moved more of the recovery controls into ctdb_recover.c
(This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec)
The election is primitive: it elects the lowest vnn as the recovery master.
Two new controls, to get/set the recovery master for a node.
To use the recovery daemon, start one
./bin/recoverd --socket=ctdb.socket*
for each ctdb daemon.
It has been briefly tested by deleting and adding nodes to a 4 node
cluster, but needs more testing.
(This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3)
of traversing the full cluster.
This makes it easier to debug recovery.
Update the test script for recovery to reflect the newish signatures to
ctdb_control.
The catdb control does still segfault, however, when there are missing
nodes in the cluster, as there are toward the end of the recovery test.
(This used to be ctdb commit 8de2a97c14a444f817ceb36461314f10c9601ecc)
one broadcast address for all nodes,
and one broadcast address for all nodes in the current vnnmap.
Update all usage of the old flag to now only broadcast to the vnnmap,
except for tools/ctdb_control, where it makes more sense to broadcast to
all nodes.
(This used to be ctdb commit dfb65b88cf67ad9d61268c4b47a6d8ae346f47df)
This program is a client of the local ctdb daemon.
Every second it pulls the vnnmap and nodemaps from all nodes that are
available and checks whether a recovery is required. A recovery is required if:
* all nodes do NOT have an identical vnnmap and generation
* all nodes do NOT have an identical nodemap
* there are active nodes that are NOT in the nodemap
* there are nodes in the nodemap that are NOT active
During recovery, the recovery tool will also make sure that all nodes
know about and have created all databases.
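A self-contained sketch of the first two consistency checks, assuming invented struct shapes (the real definitions live in the ctdb headers): recovery is declared as soon as any node's view differs from node 0's.

    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    struct vnnmap_view  { uint32_t generation; uint32_t size; uint32_t map[16]; };
    struct nodemap_view { uint32_t num; uint32_t flags[16]; };

    /* Compare every node's view against node 0's; any mismatch in the
     * vnnmap/generation or the nodemap means a recovery is required.
     * (The active-vs-map membership checks are analogous.) */
    static bool recovery_required(const struct vnnmap_view *vnn,
                                  const struct nodemap_view *nodes,
                                  int num_nodes)
    {
        for (int i = 1; i < num_nodes; i++) {
            if (memcmp(&vnn[i], &vnn[0], sizeof(vnn[0])) != 0) {
                return true;
            }
            if (memcmp(&nodes[i], &nodes[0], sizeof(nodes[0])) != 0) {
                return true;
            }
        }
        return false;
    }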
(This used to be ctdb commit 2f2650467bac7e8954de7c17cb34f46b0bdbcd26)
- allow controls to know which client invoked them
- added a client_id to clients, so they can be identified remotely
- added the ability to remove registered srvids
- in the list_keys code, register a temp srvid, then remove it afterwards
(This used to be ctdb commit 29603c51cc6d81362532cd8e50f75c8360c5f5ef)
Don't explicitly free the vnnmap pointer in the getvnnmap control; this
is freed by the mem_ctx instead.
Add code to the recoverd to detect when/if recovery is required:
verify that the number of active nodes, the nodemap and the vnnmap are
consistent across the entire cluster, and if not, trigger a recovery
(which right now just prints "we need to do recovery" to the screen).
(This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5)
Change ctdb_control so it takes a timeval pointer as argument;
this is the timeout. If the node has not responded within the timeout,
ctdb_control will return an error instead of hanging.
If the timeval pointer is NULL, the call will block indefinitely if
there is no response.
This is used for now in the createdb control, but all the ctdb_ctrl_*
helpers should probably be updated to take a timeout parameter as
well.
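A self-contained sketch of the NULL-means-block convention described above; fake_control is a stand-in for ctdb_control, and only the timeval handling is the point:

    #include <stddef.h>
    #include <stdio.h>
    #include <sys/time.h>

    /* NULL timeout = block indefinitely; otherwise fail the control
     * if no response arrives within the given interval. */
    static int fake_control(const struct timeval *timeout)
    {
        if (timeout == NULL) {
            printf("blocking until a reply arrives\n");
        } else {
            printf("giving up after %ld.%06ld seconds\n",
                   (long)timeout->tv_sec, (long)timeout->tv_usec);
        }
        return 0;
    }

    int main(void)
    {
        struct timeval tv = { .tv_sec = 3, .tv_usec = 0 };
        fake_control(&tv);    /* bounded wait */
        fake_control(NULL);   /* wait forever */
        return 0;
    }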
(This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211)
For the time being, remove all the [de]marshalling and just pass a
structure around instead.
(This used to be ctdb commit b1169555ab7015976c0135ff51121cc238f5887c)
This is not optimized at all, and copies/merges all records between
databases instead of only those records for which a certain node is the
lmaster. (Step 7 should later be enhanced to: a) delete the database, b)
push only those records for which the node is the lmaster.)
(This used to be ctdb commit 509d2c71169e96a8610f9db91293dc7a73c2cc10)
While recovery is in progress, the daemon will discard all CTDB_REQ_CALL
packets and rely on clients retransmitting them.
Add new controls to get/set the recovery mode.
(This used to be ctdb commit 41458a61577885ac49150f830e92e93e634c5411)
It does not yet work, since ctdb_control can right now only be called
from client context, and the pull is implemented as the target ctdb node
itself using a get_keys to pull the keys from the source node; thus the
ctdb daemon needs to ctdb_control to a remote node.
(This used to be ctdb commit a55c7c64b4ff87f54b90649c9f469b1ff36dc9ea)
Fix the logic in ctdb_connected to print CONNECTED if the node is
connected and UNAVAILABLE when the node is dead, instead of the opposite.
(This used to be ctdb commit 0f431d2f3e0bd94d10fe77e56cf0ed6c48402400)
dmaster request stage, and instead directly send a dmaster
reply. This avoids a race condition where a new call comes in for
the same record while processing the dmaster request
- don't keep any redirect records during a ctdb call. This prevents a
memory leak in case of a redirect storm
(This used to be ctdb commit 59889ca0fd606c7d2156839383a09dfc5a2e4853)