IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
statd callout will shortly be updated to use NFS utils' sm-notify.
This tiny helper will be used to create on-disk state files used by
sm-notify. These state files contain endian-specific fields, so
better to write a simple C implementation than to do crazy things in a
shell script (or call out to Python).
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
If an NFS service check is set to, say, unhealthy_after=2 then it will
always switch from the (default startup) unhealthy state to healthy,
even if there is a fatal problem. If all services/scripts appear OK
then the node will become healthy. When the counter hits the limit it
will return to unhealthy. This is misleading.
Instead, never use the counter at startup, until the service becomes
healthy. This stops services flapping unhealthy-healthy-unhealthy.
A side-effect is that a service that starts in a broken state will
never be restarted to try to fix the problem. This makes sense. The
counting and restarting really exist to deal with problems that might
occur under load. The first monitor events occur before public IPs
are hosted, so there can be no load. If a service doesn't start
reliably the first time then the admin probably wants to know about
it.
nfs_iterate_test() is updated to run an initial monitor event to mark
the services as healthy. This initialises the counter so it can be
used for the important part of the test. Passing the -i option avoids
running the extra monitor event, so the first iteration will be the
initial monitor event.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This makes initial failure to retrieve statistics less likely to
result in a statistics change. To help with this, statistics
retrieval stderr now goes to the log - only stdout goes to the file.
This means that the test code for checking statistics changes needs to
be redone to actually run the statistics command and check. As with
rpcinfo output, this output needs to behave as deterministically in
the test code as it done in the event script.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Checking statistics is only really relevant to timeouts. That is, if
an rpcinfo times out it is worth checking if the service making
progress. If the RPC service is not registered then the statistics
don't need to be checked because they shouldn't be changing.
The 2 previously added tests added to check statistics progress now
behave identically and fail on all iterations. To support testing
with "timeouts", an optional TIMEOUT flag can now be added to the RPC
service passed to nfs_iterate_test(). 2 new tests are added to
exercise the new behaviour.
The 2 new "if" statements in nfs_iterate_test() could be combined.
However, a subsequent commit would split them and would be more
difficult to read.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Update the remaining RPC monitoring tests to use nfs_iterate_test(),
depending on it to set results. This makes all RPC monitoring tests
consistent, so they will all benefit from future improvements.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Doing this in a previous commit would have made it more difficult to
read that commit.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The early exits from the sub-shell make the obvious successes much
more obvious, and slightly simplify the code that follows.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Handling this across two different functions led to insanity, so
simplify.
The handling of unhealthy_after when $_numfails = 0 implicitly causes
the node to be healthy. This is how the "rpcinfo succeeds" case
works. Doing it this way for statistics makes this patch easier to
read. The implicit behaviour will go away in the next patch.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The current structure here is wrong and repetitive. Checking rpcinfo
result and determining output should be in the same place.
Failure counting is now contained in
rpc_set_service_failure_response(), but needs a file to survive the
sub-shell.
Don't attempt to combine and simplify code yet. That would make this
commit harder to review.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The output file is initialised, so doesn't need to be created on
success. Treat the return code file the same way.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Nothing more complex is ever done, so we might as well simplify and
reduce coupling.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
If an RPC service is given, it is automatically marked down. This
avoids repetition in test cases and loosens coupling.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This one is in a rarely used error path, so call a function that
talloc()s the string instead.
Again, this will also print the port, which might be useful if we ever
add the ability to also specify ports in the nodes list.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Tue Aug 20 14:24:14 UTC 2024 on atb-devel-224
Same thing several times, so change to common failure code.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Node has been found, so use the pre-constructed name instead of
calling ctdb_addr_to_str().
This will also print the port, which might be useful if we ever add
the ability to also specify ports in the nodes list.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
The current constant value doesn't respect CTDB_TEST_MODE/CTDB_BASE.
Instead use the path module to allow automatic listening in test mode
with local daemons.
A single node can be tested with local daemons, using something like:
$ tests/local_daemons.sh foo setup -n 1 -C "node address"
$ grep "node address" foo/node.0/ctdb.conf
# node address = 127.0.0.1
$ tests/local_daemons.sh foo start all
$ tests/local_daemons.sh foo print-log 0 | grep -i chose
... node.0 ctdbd[24546]: ctdb chose network address 127.0.0.1:4379
The trick is that commenting out the node address in ctdb.conf means
the chosen node address is the first one from the nodes file that
allows bind/listen. In this case it is the only line.
The following ensures that automatic listening works for a node that
isn't the first:
$ cat >mynodes
192.168.1.1
127.0.0.1
$ tests/local_daemons.sh foo setup -n 2 -N mynodes -C "node address"
$ grep "node address" foo/node.1/ctdb.conf
# node address = 127.0.0.1
$ tests/local_daemons.sh foo start 1
$ tests/local_daemons.sh foo print-log 1 | grep -i chose
[...] node.1 ctdbd[22787]: ctdb chose network address 127.0.0.1:4379
Note that the first address isn't local on this host, so will always
fail.
So, doing the above and starting both nodes yields...
...
$ tests/local_daemons.sh foo start 1
$ sleep 3; tests/local_daemons.sh foo start 0
$ tests/local_daemons.sh foo print-log all | grep -i 'chose\|bind'
[...] node.1 ctdbd[26351]: ctdb chose network address 127.0.0.1:4379
[...] node.0 ctdbd[26438]: ctdb_tcp_listen_addr: Failed to bind() to socket - Address already in use (98)
[...] node.0 ctdbd[26438]: Unable to bind to any node address - giving up
... as expected.
It would be nice to add tests for this, but we don't really have
infrastructure for that. At least manual testing shows, for the
obvious cases, the previous commits didn't break anything. :-)
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
The node name is already constructed when the nodes file is loaded, so
just copy the node name.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Code to setup the transport is about to be cleaned up, including
removing uses of ctdb_set_error(), so avoid logging a NULL pointer or
some other old error.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Add the initial documentation of the node list configuration parameter.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Tue Aug 6 01:50:12 UTC 2024 on atb-devel-224
Add a single line USENODESCOMMAND directive to the fake ctdb in order to
enable use of a nodes script instead of a nodes file. For simplicity
the fake ctdb always uses `nodes.sh` in the CTDB_BASE.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Similar to the recent changes to the ctdb server code, add the ability
to load the nodes from a subprocess stdout.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
In a future commit we will add support for loading the config file from
the `ctdb` command line tool. Prior to this change the config file load
func always called D_NOTICE that causes the command to emit new text and
thus break all the tests that rely on the specific test output (not to
mention something users could notice). This change plumbs a new
`verbose` argument into some of the config file loading functions.
Generally, all existing functions will have verbose set to true to match
the existing behavior. Future callers of this function can set it to
false in order to avoid emitting the extra text.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Rename ctdb_load_nodes_file to ctdb_load_nodes as it can now load nodes
from more than a regular file.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Rename the `struct ctdb_context` field nodes_file to nodes_source to
better match that the field may indicate something other than a true
file.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Use the new "nodes list" configuration option. Executing the given path
if the path is prefixed by a `!`. The use case is to decouple the nodes
file from the shared storage, especially in the case where the shared
storage is provided by a vfs module.
For an example, imagine a script that runs `curl` on a URL for a
highly-available web server where the URL provides the content
of the nodes file.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Add a "nodes list" configuration option to the [cluster] section of the
ctdb server config. This option will be used similarly to the `cluster
lock` parameter works. When unset it defaults to the same value as
before (/etc/ctdb/nodes). If given a path that is not prefixed by `!` it
instead loads the nodes file from the given path If given a path
prefixed by `!` then it executes the path as a command and reads the
standard output as if it were the content of the nodes file.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Add ctdb_read_nodes_cmd a function that works similarly to
ctdb_read_nodes_file but reads the nodes list from the stdout of a
subprocess instead of a file in the file system.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
While here, fix a trivial memory leak (ctdbd will exit anyway if this
function fails).
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Tue Jul 23 12:39:18 UTC 2024 on atb-devel-224
ctdb_control_getnodesfile() calls ctdb_read_nodes(), which returns a
struct ctdb_node_map rather than the old version, so update associated
marshalling. While here modernise a debug message and wrap the
function arguments.
For ctdb_load_nodes_file() to use ctdb_read_nodes(), tweak
convert_node_map_to_list() to also use the modern node map structure.
Remove unused copy of ctdb_read_nodes_file().
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
These functions are intended to be used in ctdbd, the ctdb tool and
fake_ctdbd, replacing the different copies in each place.
ctdb_read_nodes() will replace ctdb_read_nodes_file(). The name
change is intentional - in future the location may be something other
than a simple filename.
The static copies of ctdb_read_nodes_file() and node_map_add() are
slightly sanitised versions of those in tools/ctdb.c, with a call to
ctdb_parse_node_address(). A bit more care is taken in node_map_add()
to avoid undefined behaviour if talloc_realloc() fails.
ctdb_parse_node_address() will replace ctdb_parse_address(). There is
an obvious argument change, since the ctdb context argument was
unused. It can only fail on an invalid node address, so return a
bool. This function might be changed later to allow the input address
string to include an optional port.
Where to put this module isn't entirely clear. It could go in common,
so be part of ctdb-util. However, if it later needs
ctdb-conf (e.g. to allow the node list location to be configurable)
then there would be a direct cyclic dependency. This is configuration
handling, so conf/ seems sane. However, I didn't want to put it into
the ctdb-conf target, since some code might need to parse a nodes list
but not need to parse ctdb.conf.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
The current fake_ctdbd code for reloading the nodes file overruns the
allocation when adding a deleted node at the end. This is a very
unlikely case, but it might as well work.
Check the size of the internal node map when marking a node deleted.
Also, update the code that adds a node to correctly set the deleted
flag when appropriate.
The included test case tests this.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Guenther Deschner <gd@samba.org>
Autobuild-User(master): Günther Deschner <gd@samba.org>
Autobuild-Date(master): Wed Jul 17 00:06:53 UTC 2024 on atb-devel-224
There are no existing tests to exercise node IP address change
detection.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Guenther Deschner <gd@samba.org>
Fails with some compilers with
error: expected ';', ',' or ')' before 'lineptr'
Signed-off-by: Björn Baumbach <bb@sernet.de>
Reviewed-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Jo Sutton <josutton@catalyst.net.nz>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Tue Jul 2 23:52:37 UTC 2024 on atb-devel-224
It has been a while since --with-libcephfs option was dropped. Therefore
stop advertising it through waf scripts.
Signed-off-by: Anoop C S <anoopcs@samba.org>
Reviewed-by: Volker Lendecke <vl@samba.org>
Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Tue Jul 2 09:13:20 UTC 2024 on atb-devel-224
CTDB uses a queue to receive requests and send answers. It works
asynchronously using the tevent framework. However there was an issue
that gave priority to the receiving side so, when a request was
processed and the answer posted to the queue, if another incoming
request arrived, it was served before sending the previous answer.
This scenario could repeat for long periods of time if the frequency of
incoming requests was high enough.
Eventually, a small time gap between incoming request gave a chance to
process the pending output queue, sending many answers in a burst.
This patch makes sure that both queues (input and output) are processed
if the event contains the appropriate flag.
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Mon Jul 1 09:17:43 UTC 2024 on atb-devel-224
We might end up using it elsewhere.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Guenther Deschner <gd@samba.org>
Reviewed-by: Anoop C S <anoopcs@samba.org>
Leave common/conf.[ch] where they are to make this commit
comprehensible.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Guenther Deschner <gd@samba.org>
Reviewed-by: Anoop C S <anoopcs@samba.org>
rpc.statd is single-threaded and runs its HA callout synchronously. If
it is too slow then latency accumulates and rpc.statd's backlog grows.
Running a pair of add-client/del-client events with the current code
averages ~0.030s in my test environment. This mean that 1000 clients
reclaiming locks after failover can easily cause 10s of latency. This
could cause rpc.statd to become unresponsive, resulting in a time out
for an rpcinfo-based health check of the status service.
Split the add-client/del-client events out to a standalone
statd_callout executable, written in C, to be used as the HA callout
for rpc.statd. All other functions move to statd_callout_helper.
Now, running a pair of add-client/del-client events in my test
environment averages only ~0.002s. This seems less likely to cause
latency problems.
The standalone statd_callout executable needs to read a configuration
file, which is generated by statd_callout_helper from the "startup"
event. It also needs access to a list of currently assigned public
IPs.
For backward compatibility, during installation a symlink is created
from $CTDB_BASE/statd-callout to the new statd_callout, which is
installed in the helper directory.
Testing this as part of the eventscript unit tests starts to become
even more of a hack than it used to be. However, the dependency on
stubs and the corresponding setup of fake state makes it hard to move
this elsewhere.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Tue Jun 25 04:24:57 UTC 2024 on atb-devel-224
There is a typo here, since there will be no process called "status".
Instead of fixing it, drop this because rpc.statd isn't the focus of
this monitoring check and when systemd is init rpc.statd isn't
restarted with nfs-ganesha. It stays running, so a confusing stack
trace for rpc.statd is always logged.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
If ganesha.nfsd is gone then a node can't provide an NFS service, so
should be marked unhealthy. A later restart may bring it back to
health.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This one does an rpcinfo check, along with statistics mitigation. It
can be used in combination with the existing 20.nfs_ganesha.check.
The equivalent kernel NFS file only restarts every 10 failures. This
one can be a little more proactive given that false positives are less
likely with the statistics mitigation.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Simplicity is preferred here over absolute correctness. If the
ganesha_stats command exits with an error or times out then no output
is produced so, implicitly, the statistics do not change. Also, the
statistics always change at startup. However, it is likely that the
statistics change when NFS makes progress and do not change when NFS
does not make progress.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
When monitoring an RPC service, the rpcinfo command might time out
even though the service is making progress. In this case, it is just
slow, so counting the timeout as a failure and potentially restarting
the service will not help. The problem is determining if a service is
making progress.
Add a new NFS checks service_stats_command. This command is intended
to run a statistics command. The output is naively compared using
cmp(1). If the output changes then rpcinfo failures are converted to
successes.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Document the new optional argument to specify the namespace to be
associated with RADOS objects in a pool.
Pair-Programmed-With: Anoop C S <anoopcs@samba.org>
Signed-off-by: Günther Deschner <gd@samba.org>
Reviewed-by: Günther Deschner <gd@samba.org>
Reviewed-by: David Disseldorp <ddiss@samba.org>
Autobuild-User(master): Anoop C S <anoopcs@samba.org>
Autobuild-Date(master): Fri Jun 14 07:42:25 UTC 2024 on atb-devel-224
RADOS objects within a pool can be associated to a namespace for
logical separation. librados already provides an API to configure
such a namespace with respect to a context. Make use of it as an
optional argument to the helper binary.
Pair-Programmed-With: Anoop C S <anoopcs@samba.org>
Signed-off-by: Günther Deschner <gd@samba.org>
Reviewed-by: Günther Deschner <gd@samba.org>
Reviewed-by: David Disseldorp <ddiss@samba.org>
While the PID check is worth it in relevant cases, NFS-Ganesha still
might go away after the check. Unfortunately, neither grace command
fails an indicative exit code, so invent one by checking error
messages. This can then be converted to success by the caller.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Thu May 30 12:50:01 UTC 2024 on atb-devel-224
If monitoring has failed because it isn't running, then don't fail
"startipreallocate" or "relaseip" by trying to go into grace.
Don't check this for "takeip". In that case NFS-Ganesha had better be
running.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
No need to grovel around in /proc. ps will happily tell us the
command.
Factor out the actual check into a separate function that can be used
elsewhere.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Path values do not need to have quotes. The current code fails if
there aren't any.
Instead, implement a 2 stage parser using 2 sed commands. See
comments in the code for details.
Regexps are POSIX basic regular expressions, apart from \<WORD\> (used
to ensure WORD is on word boundaries, and the 'i' flag for case
insensitivity. The latter is supported in FreeBSD sed.
This code successfully parses Path values out of the following
monstrosity:
path = "/foo/bar1;a";
Path = /foo/bar2;
Something = false;
Pseudo = "/foo/bar3x" ; Path = "/foo/bar3; y" ; Access_type = RO;
Pseudo = "/foo/bar4x" ; path=/foo/bar4; Access_type = RO;
Pseudo = "/foo/barNONONO" ; not_Path=/foo/barNONONO; Access_type = RO;
Path = /foo/bar5
Pseudo = "/foo/bar6x Path=foo" ; Path=/foo/bar6; Access_type = RO
This is probably the best that can be done within a shell script.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Exports may be contained in an include file rather than the top-level
ganesha.conf.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
An IP address is passed to these actions.
Reported-by: Arnab Tah <atah@ddn.com>
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
This simplifies and removes a bad hack. Also, in my test environment,
it also drops the average time take to run an add-client/del-client
pair from ~0.055s to ~0.030s.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Take advantage of new function find_statd_sm_dir() when clearing the
local system statd state directory, so it uses the correct directory
when running on a non-RH distro.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
For add-client and del-client, statd-callout is called by rpc.statd,
which runs as rpcuser, statd or some other non-root system user. This
means that add-client and del-client can't write in the statd-callout
state directory if it is only writable by root. rpc.statd must be
able to write to its own local system statd state directory, so find
this directory and use it as a reference to set the ownership of
CTDB's statd-callout state directory.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
rpc.statd runs statd-callout as a non-root user, which is currently
hacked around using some sudo logic that fails to work in some
contexts (e.g. in a container).
Use $CTDB_MY_PUBLIC_IPS_CACHE to access the node's currently assigned
public IPs, for add-client/del-client. This avoids connecting to
ctdbd when called from rpc.statd.
Also, use $CTDB_MY_PUBLIC_IPS_CACHE in other places where it makes
sense.
Connections to ctdbd are still made in the "notify" action, but this
is always run as root.
In the test code, set the PNN after public addresses setup so that the
cache of assigned IPs correctly initialised.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
This is called in a couple of places without an argument, so give it a
default.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
This is way more complicated than I would like but, as per the
comment, this is due to complexities in the way public IPs work. The
main consumer will be statd-callout, which will then be able to run as
a non-root user.
Also generate the cache file in test code, whenever the PNN is set.
However, this can cause "ctdb ip" to generate a fake IP layout before
public IPs are setup. So, have the "ctdb ip" stub generate the IP
layout every time it is run to avoid it being stale.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Add new variables statd_callout_state_dir and statd_callout_queue_dir
- the latter is for files queued by add-client/del-client.
Use $statd_callout_queue_dir to avoid a global cd to the queue
directory near the top of the script.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
All of the other uses of ctdb.tdb are in statd-callout.
New variable statd_callout_db makes it easy to change the database
name in future, perhaps even allowing it to be configurable.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Tweak some lines to avoid overflowing 80 columns.
Best viewed with "git show -w".
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
This avoids a compilation error:
../../ctdb/protocol/protocol_util.c: In function ‘ctdb_connection_list_read’:
../../ctdb/protocol/protocol_util.c:787:9: error: ‘ret’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
787 | return ret;
| ^~~
Signed-off-by: Jo Sutton <josutton@catalyst.net.nz>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Using xargs with sort -u to de-duplicate this list was my idea and
causes a couple of things to go wrong. The use of xargs causes
double-quotes to be lost. The resulting $public_ifaces value also
contains newlines. The newlines could be removed with an additional
xargs at the end of the pipeline... but that would add an extra level
of quote stripping.
I have unsuccessfully tried to find an alternative, but still elegant,
command pipeline that de-duplicates the list, while maintaining
quoting.
So, just drop the de-duplication.
This might make interface_ifindex_exists_with_options() slightly less
efficient. However, that function walks the whole list, only
terminating early when a match is found on both interface and options,
so at least it will be correct.
Include an extra testcase.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Apr 18 09:08:34 UTC 2024 on atb-devel-224
This was an implementation of getline(3), use that instead.
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <mschwenke@ddn.com>
This is the only user of common/line.[ch], which can go next.
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <mschwenke@ddn.com>
get_all_interfaces() functions gets all names for all public interfaces.
However name is misleading. Thus renamed it to get_public_ifaces() and
moved it under functions.
Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com>
Reviewed-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
We have the same function in tevent, no need to duplicate code.
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
nl->srvid is uint64_t, as is the srvid parameter of ctdb_daemon_send_message()
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Wed Mar 13 08:43:16 UTC 2024 on atb-devel-224
These were generated by 06.nfs.script.
Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com>
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Wed Mar 6 07:09:06 UTC 2024 on atb-devel-224
Event scripts run the "start_ipreallocate" hook in order to notice
that some ip addresses in the cluster potentially changed.
CTDB_SRVID_START_IPREALLOCATE gives C code a chance to get notified as well
once the event scripts are finished.
Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com>
Reviewed-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>