New function ctdb_read_nodes_file() reads a nodes file into a node
map, which is a useful intermediate format. This function should
replace the node reading code in the ctdb CLI tool. It will also be
useful for sanity checking of nodes files across the cluster.
New function convert_node_map_to_list() converts a node map to a node
array (and associated node count). This fills in the details that
aren't present in the node map. This may also be useful as a separate
function later if node list reloading stages the data after a sanity
check - the approach is not yet finalised.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Every time a nodemap is constructed the node IP addresses all need to
be parsed. This isn't a very productive use of CPU.
Instead, parse each string once when the nodes file is loaded. This
results in much simpler code.
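The parse-once idea can be sketched as follows. This is only an illustration, not the CTDB code: struct node_sketch and node_load() are hypothetical names, and the sketch handles IPv4 only for brevity.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>

/* Hypothetical node record that caches the parsed address. */
struct node_sketch {
	const char *name;     /* address text as read from the nodes file */
	struct in_addr addr;  /* parsed once at load time, reused afterwards */
};

/*
 * Parse the address text once, when the node is loaded.  Later
 * consumers (e.g. code building a nodemap) can copy node->addr
 * instead of parsing node->name again.  Returns false if the text
 * is not a valid IPv4 address (IPv4-only is an assumption of this
 * sketch).
 */
static bool node_load(struct node_sketch *node, const char *text)
{
	node->name = text;
	return inet_pton(AF_INET, text, &node->addr) == 1;
}
```

With this shape, anything that previously parsed the string on each nodemap construction would read the cached addr field instead.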
This code also removes the use of ctdb_address. Duplicating the port
is pointless without an abstraction layer around ctdb_address. If
CTDB gets an incompatible transport in the future then add an
abstraction layer.
Note that the infiniband code is not updated. Compilation of the
infiniband code is already broken. Fixing it will be a separate,
properly tested effort.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Having it require a CTDB context stops ctdb_parse_address() from being
used in more generic code. Just use the existing talloc context for
memory allocations.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Just add a flags parameter to ctdb_add_nodes() and use the same code.
Less is more.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is currently set in 2 places. One of them makes the node loading
code difficult to refactor. Also, when the surrounding code in either
place is touched then it might get broken.
This only needs to be done once at startup, not on every reload. So
do it once in a very obvious way, sacrificing a few CPU cycles for
some added clarity.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Each node reload unnecessarily and incorrectly resets the VNN map,
causing a potentially unnecessary recovery. When nodes are reloaded
any newly deleted nodes should already be disconnected and any newly
added nodes should also be disconnected. This means that reloading
the nodes file should not cause a change in the VNN map.
The current implementation also leaks memory every time the nodes are
reloaded.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
It is pointless having a recovery lock but not sanity checking that it
is working. Also, the logic that uses this tunable is confusing. In
some places the recovery lock is released unnecessarily because the
tunable isn't set.
Simplify the logic by assuming that if a recovery lock is specified
then it should be verified.
Update documentation that references this tunable.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is part of a migration to Samba's lib/util. CTDB always passes 0
(i.e. no max_size) so use a simple assert() to enforce this, rather
than changing a lot of code that will be discarded anyway.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is important enough that we should see it when the log level is
DEBUG_NOTICE.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit eb8ec5681bfccb26c8ffae72952d54bb0ba46249)
No need to check if the options are set. The options are always set
via static defaults.
No need to talloc_strdup() the values via wrapper functions. The
options aren't going away. Remove now unused ctdb_set_tdb_dir() and
similar functions.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 1fe82f3d7b610547ff4945887f15dd6c5798a49b)
This allows ctdb_load_nodes_file() to move to ctdb_server.c and
ctdb_set_nlist() to become static.
Setting ctdb->nodes_file needs to be done early, before the nodes file
is loaded. It is now set from CTDB_BASE instead of ETCDIR, so setting
CTDB_BASE also needs to be done earlier.
Unhack ctdbd_test.c - it no longer needs to define
ctdb_load_nodes_file().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 20e705e63bd3b20837cc3ac92fdcf2a9650ccfc8)
Currently flags are initialised in 2 places. One of them is in
ctdb_tcp_listen_automatic(), which just seems wrong. This makes the
code easier to follow by just doing it in ctdb_start_daemon().
This means that the flags are now initialised later than previously.
However, it is still done before the transport is started and before
clients can connect.
In future it might make sense to do a similar thing with setting the
PNN. However, the current optimisation is reasonably obvious...
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 2bbee8ac23ad5b7adf7122d8c91d5f0d54582507)
In Samba this is now called "tevent", and while we use the backwards
compatibility wrappers they don't offer EVENT_FD_AUTOCLOSE: that is now
a separate tevent_fd_set_auto_close() function.
This is based on Samba version 7f29f817fa.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 85e5e760cc91eb3157d3a88996ce474491646726)
This might be a bit less efficient, but experience in winbind has shown that
event callbacks can trigger changes in the socket state in very hard to
diagnose ways.
(This used to be ctdb commit a78b8ea7168e5fdb2d62379ad3112008b2748576)
This is used to mark nodes as being DELETED internally in ctdb
so that nodes are not renumbered if / when they are removed from the nodes file.
This is used to be able to do "ctdb reloadnodes" at runtime without
causing nodes to be renumbered.
To do this, instead of deleting a node from the nodes file, just comment it out, like this:
1.0.0.1
#1.0.0.2
1.0.0.3
After removing 1.0.0.2 from the cluster, the remaining nodes retain their
pnns from prior to the deletion, namely 0 and 2.
Any line in the nodes file that is commented out represents a DELETED pnn
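A minimal sketch of this numbering rule, assuming one node per line and that blank lines are skipped (the blank-line handling is an assumption of the sketch). struct node_entry and parse_nodes() are hypothetical names, not the CTDB implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_NODES 16  /* arbitrary limit for this sketch */
#define ADDR_LEN  64

/* Hypothetical node entry; not the real CTDB structure. */
struct node_entry {
	unsigned int pnn;     /* position of the line in the nodes file */
	bool deleted;         /* true if the line was commented out */
	char addr[ADDR_LEN];  /* address text, empty for deleted nodes */
};

/*
 * Parse nodes-file text, one node per line.  Every non-empty line
 * consumes a pnn, including commented-out ones, so the surviving
 * nodes keep their numbers.  Returns the number of entries, or -1
 * if there are too many lines or a line is too long.
 */
static int parse_nodes(const char *text, struct node_entry *nodes, int max)
{
	int count = 0;
	const char *p = text;

	while (*p != '\0') {
		size_t len = strcspn(p, "\n");

		if (len > 0) {
			if (count >= max || len >= ADDR_LEN) {
				return -1;
			}
			nodes[count].pnn = (unsigned int)count;
			nodes[count].deleted = (p[0] == '#');
			if (nodes[count].deleted) {
				nodes[count].addr[0] = '\0';
			} else {
				memcpy(nodes[count].addr, p, len);
				nodes[count].addr[len] = '\0';
			}
			count++;
		}

		p += len;
		if (*p == '\n') {
			p++;
		}
	}

	return count;
}
```

Feeding it the three-line example above gives three entries, with pnn 1 marked deleted and 1.0.0.3 still at pnn 2.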
(This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343)
to a remote node and ctdb->methods is NULL.
This can happen when we are in the middle of a normal shutdown of the
daemon and we have already shut down the transport layer (thus setting
ctdb->methods == NULL in the transport layer destructor)
and there is some unprocessed data related to a remote node.
This prevents an ugly race condition where ctdb might sometimes (rare)
cause a core dump during "ctdb shutdown".
(This used to be ctdb commit fc4e8b5a5d3699221620a8d76701c8589f2b4ff1)
This is a hack to allow backtraces under valgrind to show what opcode
is getting uninitialised bytes
(This used to be ctdb commit 67bb12c8f0af5914efb44b76bc6ddbb11fc0fcdf)
If we shut down the transport and CTDB later decides to send a command out
for queueing, the call to ctdb->methods->allocate_pkt() will SEGV.
This could trigger, for example, when we are in the process of shutting down
CTDBD and have already shut down the transport but are still waiting for the
"shutdown" eventscripts to finish.
If the event scripts take much longer to execute for some reason, this
race condition becomes much more probable.
Decorate all dereferencing of ctdb->methods-> with a check that ctdb->methods is non-NULL.
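The guard pattern looks roughly like this. The structures here are cut-down, hypothetical stand-ins (struct ctdb_sketch, struct transport_methods, transport_allocate_pkt()); the real allocate_pkt() signature in CTDB may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-ins for the CTDB structures. */
struct transport_methods {
	void *(*allocate_pkt)(size_t size);
};

struct ctdb_sketch {
	/* NULL once the transport has been shut down */
	const struct transport_methods *methods;
};

/*
 * Allocate a packet via the transport.  If the transport has already
 * been shut down, return NULL instead of dereferencing a NULL
 * methods pointer.  Callers must treat NULL as "cannot send".
 */
static void *transport_allocate_pkt(struct ctdb_sketch *ctdb, size_t size)
{
	if (ctdb->methods == NULL) {
		return NULL;
	}
	return ctdb->methods->allocate_pkt(size);
}
```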
(This used to be ctdb commit c4c2c53918da6fb566d6e9cbd6b02e61ae2921e7)
CTDB_START_AS_DISABLED="yes"
and command line argument
--start-as-disabled
When set, this makes the ctdb node always start in DISABLED mode, so it will not host any public ip addresses.
The administrator must manually "ctdb enable" the node after it has started, once the node should start hosting public ip addresses.
Using this option it is possible to start ctdb on a node without causing any reallocation of ip addresses when it is starting. The node will still merge with the cluster and there will still be a recovery phase, but the ip address allocations will not change in the cluster.
(This used to be ctdb commit b93d29f43f5306c244c887b54a77bca8a061daf2)
Add a new control that causes the node to drop the current nodes list
and reread it from the nodes file.
During this operation, the node will also drop the tcp layer and restart it.
Add a destructor to ctcp so that, when we drop the tcp layer by talloc_free()ing
the ctcp structure, we can also clean up and remove the references in the ctdb
structure to the transport layer.
Add two new commands for the ctdb tool:
one to list all nodes in the nodes file and a second to trigger a node to drop the transport and reinitialize it with the new nodes file.
(This used to be ctdb commit 4bc20ac73e9fa94ffd43cccb6eeb438eeff9963c)
specific instance of ctdbd should bind to. This helps when running a
"virtual" cluster on a single machine where all instances bind to
different alias interfaces.
If --node-ip is specified, then we will only try to bind to this ip
address. Otherwise we fall back to the original method trying the
ip addresses in /etc/ctdb/nodes one by one until we find one we can bind
to.
No variable in /etc/sysconfig/ctdb added since this parameter only makes
sense in a virtual test/debug cluster.
(This used to be ctdb commit d96cb02c2c24f9eabbc53d3d38e90dea49cff3e0)
check in the recovery daemon that all nodes are using the same
/etc/ctdb/nodes file.
Also add some more missing checks that the pnn used is a valid pnn
before using it to dereference the ctdb->nodes array
This is useful since it allows us to add more physical nodes to an
existing cluster without having to bring down the entire cluster.
The procedure to add an additional node to an existing cluster would then be
1, on all nodes set CheckNodesFile=0 using 'ctdb setvar'
2, on all nodes add CTDB_SET_CheckNodesFile=0 to /etc/sysconfig/ctdb
For each node, one at a time :
3, use 'ctdb disable' to stop the hosted services
4, service ctdb stop
5, service ctdb start
Once all nodes have been restarted
6, on all nodes remove CTDB_SET_CheckNodesFile=0 from
/etc/sysconfig/ctdb
7, on all nodes set CheckNodesFile=1 using 'ctdb setvar'
8, configure and start up the new node
During this procedure, only one node at a time was brought
down/restarted and was so only for a short period.
(This used to be ctdb commit 462501a32143e943ce350bd904a47c0955414a51)
shut down and restart the transport
Otherwise, if we use the tcp transport, the tcp connection might try to
retransmit the queued data during the time the node is unavailable.
This, together with the exponential backoff for tcp, means that the tcp
connection quickly reaches the maximum backoff rto, which is often 60 or
120 seconds. This would mean that it could take up to 60/120 seconds
before the tcp layer detects that the connection is dead and it has to
be reestablished.
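As a rough worked example of why the cap is reached quickly: the sketch below counts how many doublings it takes for a retransmission timeout to hit the cap. The helper doublings_to_cap() is hypothetical, and the 1 second starting rto is an assumption for illustration; actual values depend on the TCP stack and the measured round-trip time.

```c
#include <assert.h>

/*
 * Count how many times a retransmission timeout must double before
 * it reaches or exceeds the cap.  Values are in milliseconds.
 * Returns 0 if the timeout is 0 or already at the cap.
 */
static unsigned int doublings_to_cap(unsigned int rto_ms, unsigned int cap_ms)
{
	unsigned int n = 0;

	if (rto_ms == 0) {
		return 0;
	}
	while (rto_ms < cap_ms) {
		rto_ms *= 2;
		n++;
	}
	return n;
}
```

Starting from an assumed 1000 ms, doublings_to_cap(1000, 60000) is 6 and doublings_to_cap(1000, 120000) is 7, so only a handful of lost retransmissions are needed before each further retry waits the full 60 or 120 seconds.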
(This used to be ctdb commit 0256db470879ce556b0f00070f7ebeaf37e529ab)
set the node initially unhealthy and let the status monitoring bring the node online.
This fixes a problem with winbindd, where it refused to start because secrets.tdb was not populated
but we could not populate ctdbd, because the net command would not run while ctdbd was still doing startup
and thus frozen
(This used to be ctdb commit 3a001b793dd76fb96addf1e2ccb74da326fbcfbc)