Now we will only have one set of bugs. :-)
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 444521c852749558f39dc6131acce9e47eefd489)
Having other functions call control_ipreallocate() suggests that
it might look at the argc/argv arguments that are passed. This is not
the case. Change the callers so they call the new ipreallocate()
function instead.
Broadcast CTDB_SRVID_TAKEOVER_RUN to all connected nodes. Inactive
nodes will ignore it. This is safe since we only want 1 reply. If we
didn't get a response, we don't actually care if there's no active
recovery master - just fire, wait, retry, ...
Ignore some failures on the basis that they might be transient, so it
is probably worth retrying.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 4bf0b1c9d21986eecb7682f935bd6154c65533cc)
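The "fire, wait, retry" approach described above can be illustrated with a minimal sketch. This is not the ctdb implementation: `send_with_retry`, `broadcast` and `wait_for_reply` are hypothetical names, and treating `OSError` as the transient failure is an assumption made only for illustration.

```python
import time


def send_with_retry(broadcast, wait_for_reply, retries=5, delay=1.0, sleep=time.sleep):
    """Fire, wait, retry: return the attempt number that got a reply, else None.

    broadcast and wait_for_reply are hypothetical callables standing in for
    sending the message to all connected nodes and waiting for one reply.
    """
    for attempt in range(1, retries + 1):
        try:
            broadcast()
        except OSError:
            # Treat a failed send as possibly transient and try again.
            sleep(delay)
            continue
        if wait_for_reply():
            return attempt
        # No reply (for example, no active recovery master yet): wait, retry.
        sleep(delay)
    return None
```

The caller does not need to know which node is the recovery master: inactive nodes ignore the broadcast, and a missing reply simply leads to another attempt.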
This has already been stored at connect time and can't fail.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit d8eb2e7fdd7645719370dad4f2faa5c3fffa8249)
The current 3 second timeout is arbitrary and users trip over it
sometimes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit b49c4f39666d5b1596213bf41bcdc47ed3c327ae)
This allows eventscripts to send information about multiple tcp
connections to a single "ctdb killtcp" command, saving the overhead of
setting up a client connection per tcp connection.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit af5aa369c266430fe912df0c26116b68bac3572e)
At the moment there is no easy way to force a recovery when attempting
to reproduce certain classes of bugs. This option is added without
documentation because it is dangerous until the bugs are fixed! :-)
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 4f87925a287f612a6ab3b5da1a387a31c7bea28f)
This avoids premature exits from "ctdb stop" and "ctdb continue" due to
intermittent control (e.g. getpnn, getnodemap) timeouts.
This needs a proper fix to distinguish between timeout and failure
conditions and take appropriate action.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit c48583fd238496a81ddc46a21892f0b49559036a)
Otherwise callers can't tell the difference between some other failure
(e.g. memory allocation failure) and an unknown tunable.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 03fd90d41f9cd9b8c42dc6b8b8d46ae19101a544)
This adds more serialisation to the startup, ensuring that the
"startup" event runs after everything to do with the first recovery
(including the "recovered" event).
Given that it now takes longer to get to the "startup" state, the
initscript needs to wait until ctdbd gets to "first_recovery".
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit ed6814ff0a59ddbb1c1b3128b505380f60d7aeb7)
If one or more run states are specified then "ctdb runstate" succeeds
only if ctdbd is in one of those run states.
At the moment, if the "setup" event fails then the initscript succeeds
but ctdbd exits almost immediately. This behaviour isn't very
friendly.
The initscript now waits until ctdbd is in "startup" or "running" run
state via the use of "ctdb runstate startup running", meaning that ctdbd
has successfully passed the "setup" event.
The "setup" event code in 00.ctdb now waits until ctdbd is in the
"setup" run state before proceeding via the use of "ctdb runstate setup".
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 4a2effcc455be67ff4a779a59ca81ba584312cd6)
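As a rough illustration of the "ctdb runstate" semantics and the initscript wait described above, here is a small sketch. The function names, the polling interval and the attempt limit are assumptions; only the run state names come from the commit message.

```python
import time


def runstate_exit_code(current, wanted=()):
    """Return 0 (success) if no states were requested or current is one of them."""
    if not wanted:
        return 0
    return 0 if current in wanted else 1


def wait_for_runstate(get_runstate, wanted, attempts=30, delay=1.0, sleep=time.sleep):
    """Poll until the daemon reaches one of the wanted run states."""
    for _ in range(attempts):
        if runstate_exit_code(get_runstate(), wanted) == 0:
            return True
        sleep(delay)
    return False
```

In this sketch, an initscript-style caller would wait with `wanted=("startup", "running")`, while the "setup" event would wait with `wanted=("setup",)`.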
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit bf20c3ab090f75f59097b36186347cedb1c445d4)
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 9e7b7cd04adc5e66e2ffa4edf463a682aaea379b)
This code tried to find the recovery master and send an ipreallocate
request to that node. When a node is stopped, this code asked the
stopped node for the recovery master. A stopped node does not have
up-to-date information on the current recovery master, so ipreallocate
requests were sent to the wrong node, which ignored them because it
was not the recovery master.
Send the ipreallocate request to all active nodes. That way we guarantee
that the current recovery master will see it and respond to it.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-Programmed-With: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 0577ce3c68e4febf49a1ef5093e918db9d5ec636)
This avoids a clash with version.h from the Samba tree.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit d18fcfff674e876abde8d51afec92d9c4a090d2f)
Also, include description of -e option in usage.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 35264e42ade4676468cf7713fa339c784e932953)
Moving the IP is an optimisation so should not cause failure.
Refactor and simplify the retry-move-IP into new function
try_moveip().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 5402f85dde045576cbaf64e01c68e28ed52204e8)
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit d1ec06d30148e6fd344625a2fbf1c22391bd908a)
Most of the commands related to database operations can now use the
common code (db_exists()) to refer to a database by either name or id.
In addition to returning the db_id for a db_name, the function returns
all the flags set for the database.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit ca6e7eccc90f2869c220231666bf284798342bce)
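The name-or-id lookup described above might look roughly like the following sketch. `find_db`, the tuple layout of `dbmap` and the "0x" prefix convention for ids are all assumptions made for illustration, not the actual db_exists() code.

```python
def find_db(dbmap, name_or_id):
    """Return (db_id, flags) for a database given by name or by id, else None.

    dbmap is a hypothetical list of (db_id, name, flags) tuples.
    """
    wanted_id = None
    if name_or_id.lower().startswith("0x"):
        try:
            wanted_id = int(name_or_id, 16)
        except ValueError:
            wanted_id = None  # Not a valid hex id, so fall back to matching by name.
    for db_id, name, flags in dbmap:
        if db_id == wanted_id or name == name_or_id:
            return db_id, flags
    return None
```

Returning the flags together with the id saves each command from making a second lookup to learn, for example, whether the database is persistent.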
This fixes incorrect code where the same variable 'ret' is used to track
both the pnn and the return value of a function call.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 718233c445cd6627ab3962b6565c2655f1f8efd0)
We don't need extra commands for these.
Also, allow a default value of NOTICE for the getlog level.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 7197e600f46f2d1638f6c45c0149f109ea25a47c)
This adds the commands rdgetlog and rdclearlog.
These are analogous to getlog and clearlog but operate on the logs for
the recovery daemon.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit ef55e06192819d840c09b65741bab737223ac34c)
* Factor out repeated code into new function find_natgw()
* Support both machine and human readable output
* Use libctdb
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit a56ec75edd1705b0539513d396d311f0e80a3bf5)
control_getcapabilities(), control_lvs(), control_lvsmaster() updated
to use ctdb_getcapabilities(), ctdb_getnodemap() as appropriate.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit c30ec02615183ecf9b412ad415bf1abd859aec45)
This used to catch trailing blank lines. However, these are caught
just as effectively by the whitespace filtering in the loop below.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 7b75a3bb722dc86139b1a07a0100d08c34620b91)
The first line is currently human readable and the rest is machine
readable. This doesn't make sense. Do one or the other...
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit b29d5bbaa7048291c4b3a39bf12e04f0436f67da)
It is already in 2 places and we might use it in another.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 12a0a7a208d1c8fa8991894200d1dc133f3a2d1a)
A list of files is given rather than a command. These files are
pushed to the specified nodes.
Quoting is fragile/broken so filenames with spaces won't work - you
win some, you lose some. :-)
All of the other onnode options should work together with this option.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit aed9b98ddbbf3e81de4f7257a10676565f7d7507)
Originally, "ctdb cattdb" attached explicitly as non-persistent, which
is now forbidden for persistent databases by the server.
Pair-Programmed-With: Gregor Beck <gbeck@sernet.de>
(This used to be ctdb commit 85a367005bd669309bb7e532b60d27621110180d)
Also add a method to use the recovery master/daemon to reload the public ips on all nodes in the cluster.
Reloading the public ips on all nodes in the cluster is only supported if all nodes in the cluster are available and healthy.
(This used to be ctdb commit 05603e914f8c12618d7e06943c0f7df207f645b0)
This can improve performance and stop clients from having to chase a rapidly migrating/bouncing record
(This used to be ctdb commit d0d98f7e45e5084b81335b004d50bddc80cdc219)
If used with -n <nodes> the "current" node needs to change.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit a0340a50c2acd9ccc281faef032a364254f7f95a)
Every time we give a delegation to another node we count this as one delegation.
If the same record is delegated to several nodes we count one for each node.
Every time a record has all its delegations revoked we count this as one revoke.
(This used to be ctdb commit b098bcf8007be63889aaed640a951b0eeaa9d191)
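The counting rules above can be made concrete with a small sketch. `DelegationStats` and its methods are hypothetical names, and the choice not to count a revoke for a record with no outstanding delegations is an assumption.

```python
class DelegationStats:
    """Counts delegations per node and revokes per record."""

    def __init__(self):
        self.delegations = 0
        self.revokes = 0
        self._holders = {}  # record key -> set of nodes holding a delegation

    def delegate(self, key, node):
        # One delegation is counted for each node the record is given to.
        self.delegations += 1
        self._holders.setdefault(key, set()).add(node)

    def revoke_all(self, key):
        # Revoking every delegation of a record counts as a single revoke.
        if self._holders.pop(key, None):
            self.revokes += 1
```

Delegating one record to three nodes and then revoking it therefore yields three delegations and one revoke.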
An old, buggy version of this code was merged. This fixes it.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit bc4d5d5f0048487776f9f5d9f04a0af2e5d45aac)
parse_nodestring() checks what this used to check. parse_nodestring()
already has the nodemap.
It was a 50-50 decision whether to update verify_node() to
check against a nodemap that is passed in or to delete it.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Conflicts:
tools/ctdb.c
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 2c1baf5ddd3c2cdd2a3f98b8e208d3a48530d1d1)
This is very much like "ctdb status" but actually returns the status
for use in scripts.
This can be used in 2 modes:
* An optional nodestring is passed directly to the command without
  using -n. In this mode the current node is asked for the status of
  all listed nodes.
* The nodestring is passed via -n. In this mode the designated nodes
  will be asked for their status. This is like "ctdb status".
These modes can be mixed. For example:
  ctdb nodestatus 1,2,3 -n 0
asks node 0 for the status of nodes 1, 2 and 3, returning the bitwise
OR of their statuses.
This version uses the auto_all functionality, so the output isn't
necessarily pretty. An improved version that does its own -n
processing may appear soon.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 355685a14be17bf4648788f8a72c54790fe03502)
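The "bitwise OR of their statuses" mentioned above can be sketched as follows. `combined_status` is a hypothetical helper and the flag values in the usage note are made up for illustration; they are not the real node flag constants.

```python
from functools import reduce
from operator import or_


def combined_status(flags_by_pnn, pnns):
    """OR together the flags of the listed nodes; 0 means none has a flag set."""
    return reduce(or_, (flags_by_pnn[pnn] for pnn in pnns), 0)
```

For example, with hypothetical flags `{1: 0x0, 2: 0x2, 3: 0x4}`, asking about nodes 1, 2 and 3 gives `0x6`, so a script sees a non-zero result whenever any listed node has a flag set.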
Create 2 new functions: control_status_1_machine() and
control_status_1_human() that contain chunks of code from
control_status(). We're about to find another purpose for these
functions.
This should be a no-op.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit fade71539482e8276f57ba3c003fe004d8666ce7)
Centralise -n nodestring parsing and add the ability to pass a
comma-separated list of node numbers. Listing a node that is
disconnected or deleted results in failure, similar to the way passing
a single node currently works. All of the auto_all commands inherit
this functionality. For now, the non-auto_all commands do not inherit
this - they need to be individually tweaked. Therefore, we haven't
updated the documentation to advertise this feature.
Implemented via a new function parse_nodestring() that parses an
optional (pass NULL when not available to indicate "current node")
comma-separated list of node numbers or "all". parse_nodestring() can
be told to be non-fatal for disconnected/deleted nodes so it can also
be used in other contexts (yes, coming soon). main() is changed to
call this function.
A new magic PNN value CTDB_MULTICAST is added, along with a
corresponding option.nodes structure member (a talloc-ed array of
PNNs). This is also populated for "all".
control_status() has new function pretty_print_flags() factored out so
pretty-printed flags can be used in error/debug messages. New
function is_partially_online() is also factored out - this simplifies
some of the logic.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 920e3a732eb9e09004edde6cfb3c7db8a004016f)
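A minimal sketch of the nodestring parsing described above, under stated assumptions: the flag values, the dict-based `nodemap`, the meaning of "all" (every node that is neither disconnected nor deleted) and the behaviour of the non-fatal mode (keep the node in the list) are guesses for illustration, not the actual parse_nodestring() code.

```python
DISCONNECTED = 0x1  # hypothetical flag values, for illustration only
DELETED = 0x2


def parse_nodestring(nodestring, nodemap, current_pnn, fatal=True):
    """Turn None, "all" or "0,2,3" into a list of PNNs.

    nodemap is a hypothetical dict mapping PNN to a flags bitmask.
    """
    unusable = DISCONNECTED | DELETED
    if nodestring is None:
        return [current_pnn]
    if nodestring == "all":
        # Assumption: "all" means every node that is neither disconnected nor deleted.
        return [pnn for pnn in sorted(nodemap) if not nodemap[pnn] & unusable]
    pnns = []
    for token in nodestring.split(","):
        pnn = int(token)  # raises ValueError for a non-numeric entry
        if pnn not in nodemap:
            raise ValueError(f"node {pnn} does not exist")
        if nodemap[pnn] & unusable and fatal:
            raise ValueError(f"node {pnn} is disconnected or deleted")
        pnns.append(pnn)
    return pnns
```

Keeping the fatal/non-fatal choice as a parameter is what lets the same parser serve both the -n option and other contexts.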
This puts the parsing and checking logic close together. This makes
it easy to change the parsing code. Changed parsing code can now
easily use both old client code and libctdb since both are guaranteed
to be set up.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 57fb074a65dc56168fc3813b79a5bab4b3727cf3)
It's the only one in the file.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit bf1174ef699b06485b36ee8ae70412be0759e142)
This allows all that logic to be hacked a little more easily.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit f93ffeee7b9e9ca5dd116655bdc7f89fc987ed8a)
Most of the action in main() happens inside a for loop and an if
statement. This causes 2 levels of extra indent for the code and
makes it harder to read.
Instead, the current body of the loop is put below the loop and its
corresponding failure check.
To see how small this change really is, view with "diff -w". ;-)
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 8527396b7290cfc8378779631e91d2ae09e2a106)
This didn't have auto_all set as true. However, there's no special
code to handle "-n all" and it just fails. If auto_all works for
status then it might as well work for scriptstatus.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 3084220e2aac3664511969f10cad206e505150a0)
This patch changes the callback signature for traversal
functions to allow a client to abort a traverse before it finishes.
Update all callers and examples, as well as the rb-test tool.
(This used to be ctdb commit 8ab0c63ad36cfbbb1e5fed46a1f4c47b1fdb581f)
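The idea of an abortable traverse can be sketched as below. The convention that a non-zero return value from the callback stops the traverse is an assumption, and `traverse` is a hypothetical stand-in rather than the real API.

```python
def traverse(records, callback):
    """Call callback(key, value) per record; stop early on a non-zero return.

    Returns the number of records visited.
    """
    visited = 0
    for key, value in records:
        visited += 1
        if callback(key, value) != 0:
            break
    return visited
```

A client that is searching for one record can return non-zero as soon as it finds it, instead of being called for every remaining record.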
This case was never tested and fakessh obviously won't handle the
extra arguments.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 02184bd5b9ab94cdf2b9ff92e56a509f92f9e4aa)
Current behaviour is for onnode to time out (for about 20s) for each
attempted ssh to a down node. With 40 or 50 invocations of onnode
this takes a long time.
2 changes to work around this:
* If EXTRA_SSH_OPTS (which is passed to ssh by onnode) does not
  contain a ConnectTimeout= setting then add a setting for a 5 second
  timeout.
* Filter the nodes before starting any diagnosis, taking out any "bad
nodes" that are uncontactable via onnode.
In the nodes summary at the beginning of the output, print
information about any "bad nodes".
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 8c3b6427dbaade87e1a0f5590f0894c2e69b31a3)
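The two workarounds above can be sketched as follows. The helper names and the `is_contactable` probe are hypothetical; `-o ConnectTimeout=<seconds>` is the standard ssh option form, but how the real script builds EXTRA_SSH_OPTS is not shown in the commit message.

```python
def with_connect_timeout(extra_ssh_opts, seconds=5):
    """Append a ConnectTimeout option unless the caller already set one."""
    if "ConnectTimeout=" in extra_ssh_opts:
        return extra_ssh_opts
    return f"{extra_ssh_opts} -o ConnectTimeout={seconds}".strip()


def split_nodes(nodes, is_contactable):
    """Partition nodes into (good, bad) using a hypothetical probe callable."""
    good, bad = [], []
    for node in nodes:
        (good if is_contactable(node) else bad).append(node)
    return good, bad
```

Probing each node once up front means the timeout is paid at most once per bad node, rather than on every later onnode invocation.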
Add option -e to get the old behaviour and process empty records too.
Signed-off-by: Michael Adam <obnox@samba.org>
(This used to be ctdb commit d9859540c2000864bc6c58be5afe19aa3b1064b2)
1/0 is unsuitable since it can be useful to check 'if a column is "1" there is something wrong with that node'
(This used to be ctdb commit b963f5e40b1e73a60363568da88557cad9e58a28)