1
0
mirror of https://github.com/samba-team/samba.git synced 2024-12-22 13:34:15 +03:00
Commit Graph

2622 Commits

Author SHA1 Message Date
Stefan Metzmacher
c6602b686b ctdb: add/implement CTDB_CONTROL_TCP_CLIENT_DISCONNECTED
With multichannel a ctdb connection from smbd may hold multiple
tcp connections, which can be disconnected before the smbd
process terminates the whole ctdb connection, so we a
way to remove undo 'CTDB_CONTROL_TCP_CLIENT' again.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=15523

Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2023-12-15 11:06:34 +00:00
Stefan Metzmacher
5f52d140f7 ctdb: make use of ctdb_canonicalize_ip_inplace() in ctdb_control_tcp_client()
We could also remove the src_addr and dest_addr helper variables
completely, but that would be too much for this commit.

Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2023-12-15 11:06:34 +00:00
Stefan Metzmacher
92badd3bdd ctdb: remove unused ctdb->client_ip_list and print debug on ctdb_tcp_list instead
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15523

Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2023-12-15 11:06:34 +00:00
Joseph Sutton
cec6c7e233 ctdb: Fix code spelling
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
2023-12-08 02:28:33 +00:00
Joseph Sutton
265e3699ac ctdb: Remove trailing whitespace
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
2023-12-08 02:28:33 +00:00
Volker Lendecke
a6b66661c7 ctdb: Add "home_nodes" file to deterministic IP allocation
With a file "home_nodes" next to "public_addresses" you can assign
public IPs to specific nodes when using the deterministic allocation
algorithm. Whenever the "home node" is up, the IP address will be
assigned to that node, independent of any other deterministic
calculation. The line

192.168.21.254 2

in the file "home_nodes" assigns the IP address to node 2. Only when
node 2 is not able to host IP addresses, 192.168.21.254 undergoes the
normal deterministic IP allocation algorithm.

Signed-off-by: Volker Lendecke <vl@samba.org>

add home_nodes
Reviewed-by: Ralph Boehme <slow@samba.org>

Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Tue Oct 10 14:17:19 UTC 2023 on atb-devel-224
2023-10-10 14:17:19 +00:00
Volker Lendecke
23ccb1c0ca ctdb: Align variable signedness
ipalloc_state->num_nodes is uint32_t
Reviewed-by: Ralph Boehme <slow@samba.org>
2023-10-10 13:14:31 +00:00
Martin Schwenke
8b9f464420 ctdb-daemon: Call setproctitle_init()
Commit 19c82c19c0 changed the behaviour
of prctl_set_comment() so it now calls setproctitle(3bsd) by default.

In some Linux distributions (e.g. Rocky Linux 8.8), this results in
messages like this spamming the logs:

  ctdbd: setproctitle not initialized, please either call setproctitle_init() or link against libbsd-ctor.

Most Samba daemons seem to call setproctitle_init(), so do it here.

In the longer term CTDB should also switch to using lib/util's
process_set_title(), like the rest of Samba, for more flexible process
names.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=15479

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Ralph Boehme <slow@samba.org>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Sep 21 00:46:50 UTC 2023 on atb-devel-224
2023-09-21 00:46:50 +00:00
Joseph Sutton
c62491473a ctdb: Fix code spelling
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
2023-09-11 02:42:41 +00:00
Joseph Sutton
9769b594f4 ctdb: Add missing newline to logging message
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
2023-08-08 04:39:37 +00:00
Joseph Sutton
a8085b3dd5 ctdb: Add missing newlines to logging messages
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
2023-08-08 04:39:36 +00:00
Martin Schwenke
2e2d81b92a ctdb-recoverd: CID 1509028 - Use of 32-bit time_t (Y2K38_SAFETY)
usecs is going to be passed as a uint32_t.  There is no need to
calculate it as a time_t.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2023-07-19 09:01:33 +00:00
Martin Schwenke
61dfc8bc06 ctdb-server: Avoid logging a count of 0 resent calls
This fixes a little thinko in commit
80de84d36e, where this was overlooked.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Jul 10 15:15:06 UTC 2023 on atb-devel-224
2023-07-10 15:15:06 +00:00
Martin Schwenke
51d0445a7d ctdb-logging: Really make NOTICE the default debug level
NOTICE level debug messages in common/run_event.c are not logged by
default.

Currently eventd ends up using ERROR, since this is specified as
LOGGING_LOG_LEVEL_DEFAULT.  It doesn't inherit the debug level from
ctdbd and only uses NOTICE level when interactive.

Change the real logging default to NOTICE and use it everywhere.

Followups might be:

* Remove the default_log_level argument to logging_conf_init()
* Kick eventd to update debug level when "ctdb setdebug" is used

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2023-07-10 14:21:30 +00:00
Christof Schmitt
4dccf5afa4 ctdb-recovery: Use correct struct ban_node_state type for state
If this codepath is hit, ctdb aborts with:

ctdb/server/ctdb_recovery_helper.c:2687: Type mismatch: name[struct ban_node_state] expected[struct node_ban_state]")
    at ../../lib/talloc/talloc.c:505

Fix this by using the correct type.

Signed-off-by: Christof Schmitt <cs@samba.org>
Reviewed-by: Volker Lendecke <vl@samba.org>

Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Wed May  3 08:04:09 UTC 2023 on atb-devel-224
2023-05-03 08:04:09 +00:00
Andreas Schneider
7749df4992 ctdb:server: Fix code spelling
Best reviewed with: `git show --word-diff`

Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Martin Schwenke <mschwenke@ddn.com>
2023-03-24 07:01:31 +00:00
Andreas Schneider
19f418b68f ctdb:server: Remove trailing whitespaces in ctdb_server.c
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Martin Schwenke <mschwenke@ddn.com>
2023-03-24 07:01:31 +00:00
Andreas Schneider
59af504999 ctdb:server: Remove trailing whitespaces in ctdb_recover.c
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Martin Schwenke <mschwenke@ddn.com>
2023-03-24 07:01:31 +00:00
Michael Tokarev
96154a26fe spelling fixes for 4.18 (errror implemenation proces Controler)
One of changes is somewhat interesting, it is "tfork waiter proces"
process title in tfork.c. I wonder why no one noticed this before.
There's another similar process title in there, "tfork waiter process(%d)".
Hopefully no one does grep for "proces$" (and there's no reason to).

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Rowland Penny <rpenny@samba.org>

Autobuild-User(master): Jeremy Allison <jra@samba.org>
Autobuild-Date(master): Thu Jan 26 20:46:11 UTC 2023 on atb-devel-224
2023-01-26 20:46:11 +00:00
Volker Lendecke
35ee3e0231 ctdb: Fix the build on FreeBSD
"basename" is define in libgen.h included from system/dir.h

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
2023-01-18 11:49:38 +00:00
Martin Schwenke
061315cc79 ctdb-mutex: Test the lock by locking a 2nd byte range
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-07-28 10:09:34 +00:00
Martin Schwenke
97a1714ee9 ctdb-mutex: open() and fstat() when testing lock file
This makes a file descriptor available for other I/O.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-07-28 10:09:34 +00:00
Martin Schwenke
c07e81abf0 ctdb-mutex: Factor out function fcntl_lock_fd()
Allows blocking mode and start offset to be specified.  Always locks a
1-byte range.

Make the lock structure static to avoid initialising the whole
structure each time.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-07-28 10:09:34 +00:00
Martin Schwenke
9daf22a5c9 ctdb-mutex: Handle pings from lock checking child to parent
The ping timeout is specified by passing an extra argument to the
mutex helper, representing the ping timeout in seconds.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-07-28 10:09:34 +00:00
Martin Schwenke
b5db286791 ctdb-mutex: Do inode checks in a child process
In future this will allow extra I/O tests and a timeout in the parent
to (hopefully) release the lock if the child gets wedged.  For
simplicity, use tmon only to detect when either parent or child goes
away.  Plumbing a timeout for pings from child to parent will be done
later.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-07-28 10:09:34 +00:00
Martin Schwenke
2ecdbcb22c ctdb-mutex: Rename wait_for_lost to lock_io_check
This will be generalised to do more I/O-based checks.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-07-28 10:09:34 +00:00
Martin Schwenke
7ab2e8f127 ctdb-mutex: Rename recheck_time to recheck_interval
There will be more timeouts so clarify the intent of this one.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-07-28 10:09:34 +00:00
Martin Schwenke
c396b61504 ctdb-mutex: Consistently use progname in error messages
To avoid error messages having ridiculously long paths, set progname
to basename(argv[0]).

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-07-28 10:09:34 +00:00
Martin Schwenke
3efa56aa61 ctdb-daemon: Fix printing of tickle ACKs
Commit f5a2037734 arguably got this
back-to-front:

  2022-07-27T09:50:01.985857+10:00 testn1 ctdbd[17820]: ../../ctdb/server/ctdb_takeover.c:514 sending TAKE_IP for '10.0.1.173'
  2022-07-27T09:50:01.990601+10:00 testn1 ctdbd[17820]: Send TCP tickle ACK: 10.0.1.77:33004 -> 10.0.1.173:2049
  2022-07-27T09:50:01.991323+10:00 testn1 ctdb-takeover[19758]: TAKEOVER_IP 10.0.1.173 succeeded on node 0

Unfortunately there is an inconsistency somewhere in the connection
tracking code used for tickle ACKs, making this less than obvious.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu Jul 28 09:02:08 UTC 2022 on sn-devel-184
2022-07-28 09:02:08 +00:00
Martin Schwenke
c77a4fde7a ctdb-daemon: Modernise debug in ctdb_add_public_address()
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-07-22 16:09:31 +00:00
Martin Schwenke
d62fcba7dc ctdb-daemon: Avoid spurious error sending ARPs for released IP
A public IP address can be released in between (and probably before)
attempts to send ARPs.  One situation when this can occur is when a
cluster is shutting down: node A shuts down first, public IPs from
node A are taken over by node B, node B is shutdown.

Notice this when it occurs and cancel further attempts to send ARPs.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-07-22 16:09:31 +00:00
Martin Schwenke
f5a2037734 ctdb-daemon: Modernise debug in ctdb_control_send_arp()
For the tickle ACK logging, render the connection in a buffer.  This
produces more complete information.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-07-22 16:09:31 +00:00
Martin Schwenke
9898e7c555 ctdb-recoverd: Clean up banning culprit code
Make this fully self-contained in the recovery daemon and avoid
indexing by PNN.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-07-22 16:09:31 +00:00
Martin Schwenke
19fbc2da38 ctdb-recoverd: Add pnn field to banning state structure
This structure is now standalone, so indexing by PNN can be avoided
via a subsequent commit.  Index by culprit here to make this commit
simple.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-07-22 16:09:31 +00:00
Martin Schwenke
0b5dd07604 ctdb-recoverd: Add function node_flags() and use it in elections
Indexing a node map by PNN is suboptimal.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-07-22 16:09:31 +00:00
Martin Schwenke
e752f841e6 ctdb-daemon: Use DEBUG() macro for child logging
Directly using dbgtext() with file logging results in a log entry with
no header, which is wrong.  This is a regression, introduced in commit
10d15c9e5d.  Prior to this, CTDB's
callback for file logging would always add a header.

Use DEBUG() instead dbgtext().  Note that DEBUG() effectively compares
the passed script_log_level with DEBUGLEVEL, so an explicit check is
no longer necessary.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=15090

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Volker Lendecke <vl@samba.org>

Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Thu Jun 16 13:33:10 UTC 2022 on sn-devel-184
2022-06-16 13:33:10 +00:00
Martin Schwenke
88f35cf862 ctdb-daemon: Drop unused prefix, logfn, logfn_private
These aren't set anywhere in the code.

Drop the log argument because it is also no longer used.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=15090

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Volker Lendecke <vl@samba.org>
2022-06-16 12:42:35 +00:00
Martin Schwenke
90a96f06a9 ctdb-recoverd: Do not ban on unknown error when taking cluster lock
If the cluster filesystem is unavailable then I/O errors may occur.
This is no worse than contention, so don't ban.  This avoids having
services unavailable for longer than necessary.

Update the associated test to simply confirm that this results in a
leaderless cluster, and leadership is restored when the lock can once
again be taken.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-05-31 05:06:29 +00:00
Martin Schwenke
da9decfc5e ctdb-daemon: Remove unused #includes of rb_tree.h
ctdb_takeover.c and eventscript.c no longer use this.
ipalloc_common.c has never used it.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-05-31 05:06:29 +00:00
Martin Schwenke
80de84d36e ctdb-daemon: Log per-database summary of resent calls
After a recovery that takes a significant amount of time the logs are
flooded with messages about every resent call.

Log a summary instead and demote per-call messages to INFO level.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-05-31 05:06:29 +00:00
Pavel Filipenský
8cb6565011 ctdb: Covscan: unchecked return value for trbt_traversearray32()
Signed-off-by: Pavel Filipenský <pfilipen@redhat.com>
Reviewed-by: Jeremy Allison <jra@samba.org>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
2022-05-14 03:49:32 +00:00
Martin Schwenke
d52b497d11 ctdb-locking: Don't pass NULL to tevent_req_is_unix_error()
If there is an error then this pointer is unconditionally
dereferenced.

However, the only possible error appears to be ENOMEM, where a crash
caused by dereferencing a NULL pointer isn't a terrible outcome.  In
the absence of a security issue this is probably not worth
backporting.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-05-03 09:19:31 +00:00
Martin Schwenke
490e5f4d4c ctdb-mutex: Don't pass NULL to tevent_req_is_unix_error()
If there is an error then this pointer is unconditionally
dereferenced.

However, the only possible error appears to be ENOMEM, where a crash
caused by dereferencing a NULL pointer isn't a terrible outcome.  In
the absence of a security issue this is probably not worth
backporting.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-05-03 09:19:31 +00:00
Martin Schwenke
6fb08a6580 ctdb-daemon: Don't release all public IPs during shutdown sequence
This further untangles public IP handling from the main daemon.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-04-06 06:34:37 +00:00
Martin Schwenke
f49446cb1e ctdb-daemon: Load tunables from ctdb.tunables
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-04-06 06:34:37 +00:00
Martin Schwenke
a509ee059e ctdb-daemon: New function ctdb_tunables_load()
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-04-06 06:34:37 +00:00
Martin Schwenke
0e74e03c9c ctdb-recoverd: Consistently log start of election
Elections should now be quite rare, so always log when one begins.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-02-14 01:47:31 +00:00
Martin Schwenke
bf55a0117d ctdb-recoverd: Always send unknown leader broadcast when starting election
This is currently missed when the cluster lock is lost.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-02-14 01:47:31 +00:00
Martin Schwenke
9b3fab052b ctdb-recoverd: Consistently have caller set election-in-progress
The problem here is that election-in-progress must be set to
potentially avoid restarting the election broadcast timeout in
main_loop(), so this is already done by leader_handler().

Have force_election() set election-in-progress for all election types
and do not bother setting it in cluster_lock_election().

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-02-14 01:47:31 +00:00
Martin Schwenke
188a902156 ctdb-recoverd: Always cancel election in progress
Election-in-progress is set by unknown leader broadcast, so needs to
be cleared in all cases when election completes.

This was seen in a case where the leader node stalled, so didn't send
leader broadcasts for some time.  The node continued to hold the
cluster lock, so another node could not become leader.  However, after
the node returned to normal it still did not send leader broadcasts
because election-in-progress was never cleared.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-02-14 01:47:31 +00:00