Changes in CTDB 2.4
===================

User-visible changes
--------------------

* A missing network interface now causes monitoring to fail and the
  node to become unhealthy.

* Changed ctdb command's default control timeout from 3s to 10s.

* debug-hung-script.sh now includes the output of "ctdb scriptstatus"
  to provide more information.

Important bug fixes
-------------------

* Starting the CTDB daemon by running ctdbd directly no longer
  removes an existing Unix socket unconditionally.

* ctdbd once again successfully kills client processes on releasing
  public IPs. It was checking for them as tracked child processes
  and not finding them, so wasn't killing them.

* ctdbd_wrapper now exports CTDB_SOCKET so that child processes of
  ctdbd (such as uses of ctdb in eventscripts) use the correct socket.

* Always use Jenkins hash when creating volatile databases. There
  were a few places where TDBs would be attached with the wrong flags.

* Vacuuming code fixes in CTDB 2.2 introduced bugs in the new code
  which led to header corruption for empty records. This resulted
  in inconsistent headers on two nodes, so a request for such a
  record would keep bouncing between nodes indefinitely, logging
  "High hopcount" messages. This also caused performance
  degradation.

* ctdbd was losing log messages at shutdown because they weren't being
  given time to flush. ctdbd now sleeps for a second during shutdown
  to allow time to flush log messages.

* Improved socket handling introduced in CTDB 2.2 caused ctdbd to
  process a large number of packets available on a single FD before
  polling other FDs. ctdbd now uses fixed size queue buffers to allow
  fair scheduling across multiple FDs.

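The fair scheduling idea in the last item above can be illustrated
with a minimal sketch. This is not ctdbd's code: the function name
drain_fairly and the BUF_SIZE value are hypothetical, and Python's
select module stands in for ctdbd's own event loop. The point is only
that each ready FD gets at most one fixed size read per polling pass.

```python
import select
import socket

BUF_SIZE = 1024  # hypothetical fixed per-FD read budget for one polling pass


def drain_fairly(socks, buf_size=BUF_SIZE):
    """Read every socket until EOF, taking at most buf_size bytes from
    each ready socket per polling pass.

    Returns a list of (fileno, bytes_read) in the order reads happened.
    """
    pending = {s.fileno(): s for s in socks}
    order = []
    while pending:
        readable, _, _ = select.select(list(pending.values()), [], [])
        for s in readable:
            # Bounded read: one buffer's worth, then move on to the next FD.
            chunk = s.recv(buf_size)
            if not chunk:
                # EOF: stop polling this socket (the caller closes it).
                del pending[s.fileno()]
                continue
            order.append((s.fileno(), len(chunk)))
    return order
```

With one busy socket and one quiet socket, the quiet one is served
after at most one bounded read from the busy one, instead of waiting
for the busy socket to be drained completely.
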
Important internal changes
--------------------------

* A node that fails to take/release multiple IPs will only incur a
  single banning credit. This makes a brief failure less likely to
  cause a node to be banned.

* ctdb killtcp has been changed to read connections from stdin and
  10.interface now uses this feature to improve the time taken to kill
  connections.

* Improvements to hot records statistics in ctdb dbstatistics.

* Recovery daemon now assembles up-to-date node flags information
  from remote nodes before checking if any flags are inconsistent and
  forcing a recovery.

* ctdbd no longer creates multiple lock sub-processes for the same
  key. This reduces the number of lock sub-processes substantially.

* Changed the nfsd RPC check failure policy to fail over quickly
  instead of trying to repair a node first by restarting NFS. Such
  restarts would often hang if the cause of the RPC check failure was
  the cluster filesystem or storage.

* Logging improvements relating to high hopcounts and sticky records.

* Make sure lower level tdb messages are logged correctly.

* CTDB commands disable/enable/stop/continue are now resilient to
  individual control failures and retry in case of failures.


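The "one lock sub-process per key" item above can be pictured with a
toy model. Everything here is an assumption made for illustration:
the LockScheduler class, its method names and the spawn_helper
callable are hypothetical and do not describe ctdbd's data
structures. The sketch only shows the idea of queueing later requests
for a key while a helper for that key is already running.

```python
from collections import defaultdict, deque


class LockScheduler:
    """Toy model: at most one active lock helper per key at any time."""

    def __init__(self, spawn_helper):
        self._spawn_helper = spawn_helper   # callable(key); stands in for fork/exec
        self._active = set()                # keys that currently have a helper
        self._waiting = defaultdict(deque)  # key -> queued request callbacks

    def request(self, key, callback):
        """Queue a lock request; start a helper only if none is active."""
        self._waiting[key].append(callback)
        self._schedule(key)

    def _schedule(self, key):
        if key in self._active or not self._waiting[key]:
            return
        self._active.add(key)
        self._spawn_helper(key)

    def helper_done(self, key):
        """The helper for key obtained the lock: serve the oldest request."""
        self._active.discard(key)
        if self._waiting[key]:
            callback = self._waiting[key].popleft()
            callback(key)
        # Start a helper for the next waiter on this key, if there is one.
        self._schedule(key)
```

In this model, three back-to-back requests for the same key start one
helper rather than three; the remaining requests are served one at a
time as each helper finishes.
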
Changes in CTDB 2.3
===================

User-visible changes
--------------------

* Two new configuration variables for the 60.nfs eventscript:

    - CTDB_MONITOR_NFS_THREAD_COUNT
    - CTDB_NFS_DUMP_STUCK_THREADS

  See ctdb.sysconfig for details.

* Removed DeadlockTimeout tunable. To enable debugging of locking
  issues, set:

    CTDB_DEBUG_LOCKS=/etc/ctdb/debug_locks.sh

* In overall statistics and database statistics, lock buckets have been
  updated to use the following timings:

    < 1ms, < 10ms, < 100ms, < 1s, < 2s, < 4s, < 8s, < 16s, < 32s, < 64s, >= 64s

* Initscript is now simplified with most CTDB-specific functionality
  split out to ctdbd_wrapper, which is used to start and stop ctdbd.

* Add systemd support.

* CTDB subprocesses are now given informative names to allow them to
  be easily distinguished when using programs like "top" or "perf".

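As an illustration of the lock bucket timings listed above, here is a
minimal sketch that maps a lock latency in seconds to a bucket label.
It assumes each latency is counted in the first bucket whose upper
bound it falls under; the function name lock_bucket and the use of
seconds as the unit are choices made for this example, not taken from
the CTDB sources.

```python
import bisect

# Upper bounds, in seconds, of the "< X" buckets. Anything at or above
# the last bound lands in the final ">= 64s" bucket.
BOUNDS = [0.001, 0.01, 0.1, 1, 2, 4, 8, 16, 32, 64]
LABELS = ["< 1ms", "< 10ms", "< 100ms", "< 1s", "< 2s", "< 4s",
          "< 8s", "< 16s", "< 32s", "< 64s", ">= 64s"]


def lock_bucket(latency):
    """Return the label of the bucket that a latency (seconds) falls in."""
    # bisect_right makes a latency equal to a bound move up a bucket,
    # which matches the strict "<" in the labels.
    return LABELS[bisect.bisect_right(BOUNDS, latency)]
```

For example, lock_bucket(0.0005) gives "< 1ms", lock_bucket(1.5)
gives "< 2s" and lock_bucket(64) gives ">= 64s".
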
Important bug fixes
-------------------

* ctdb tool should not exit from a retry loop if a control times out
  (e.g. under high load). This simple fix will stop an exit from the
  retry loop on any error.

* When updating flags on all nodes, use the correct updated flags. This
  should avoid wrong flag change messages in the logs.

* The recovery daemon will not ban other nodes if the current node
  is banned.

* ctdb dbstatistics command now correctly outputs database statistics.

* Fixed a panic with overlapping shutdowns (regression in 2.2).

* Fixed 60.ganesha "monitor" event (regression in 2.2).

* Fixed a buffer overflow in the "reloadips" implementation.

* Fixed segmentation faults in ping_pong (called with incorrect
  argument) and test binaries (called when ctdbd not running).

Important internal changes
--------------------------

* The recovery daemon on a stopped or banned node will stop
  participating in any cluster activity.

* Improve cluster-wide database traverse by sending the records
  directly from the traverse child process to the requesting node.

* TDB checking and dropping of all IPs moved from initscript to "init"
  event in 00.ctdb.

* To avoid "rogue IPs" the release IP callback now fails if the
  released IP is still present on an interface.


Changes in CTDB 2.2
===================

User-visible changes
--------------------

* The "stopped" event has been removed.

  The "ipreallocated" event is now run when a node is stopped. Use
  this instead of "stopped".

* New --pidfile option for ctdbd, used by initscript

* The 60.nfs eventscript now uses configuration files in
  /etc/ctdb/nfs-rpc-checks.d/ for timeouts and actions instead of
  hardcoding them into the script.

* Notification handler scripts can now be dropped into /etc/ctdb/notify.d/.

* The NoIPTakeoverOnDisabled tunable has been renamed to
  NoIPHostOnAllDisabled and now works properly when set on individual
  nodes.

* New ctdb subcommand "runstate" prints the current internal runstate.
  Runstates are used for serialising startup.

Important bug fixes
-------------------

* The Unix domain socket is now set to non-blocking after the
  connection succeeds. This avoids connections failing with EAGAIN
  and not being retried.

* Fetching from the log ringbuffer now succeeds if the buffer is full.

* Fix a severe recovery bug that can lead to data corruption for SMB clients.

* The statd-callout script now runs as root via sudo.

* "ctdb delip" no longer fails if it is unable to move the IP.

* A race in the ctdb tool's ipreallocate code was fixed. This fixes
  potential bugs in the "disable", "enable", "stop", "continue",
  "ban", "unban", "ipreallocate" and "sync" commands.

* The monitor cancellation code could sometimes hang indefinitely.
  This could cause "ctdb stop" and "ctdb shutdown" to fail.

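The ordering described in the first item above (connect first, then
switch to non-blocking) can be sketched as follows. This is an
illustration in Python, not the CTDB client code, and the helper name
connect_unix is hypothetical. On Linux, connect(2) documents EAGAIN
for non-blocking Unix domain sockets whose connection cannot be
completed immediately, which is the failure a blocking connect
sidesteps.

```python
import socket


def connect_unix(path):
    """Connect to a Unix domain socket, then make it non-blocking."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # The socket is still blocking here, so connect() waits for the
        # listener instead of failing straight away.
        sock.connect(path)
    except OSError:
        sock.close()
        raise
    # Only switch to non-blocking once the connection is established.
    sock.setblocking(False)
    return sock
```

Later reads and writes on the returned socket are non-blocking and
are expected to be driven by an event loop.
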
Important internal changes
--------------------------

* The socket I/O handling has been optimised to improve performance.

* IPs will not be assigned to nodes during CTDB initialisation. They
  will only be assigned to nodes that are in the "running" runstate.

* Improved database locking code. One improvement is to use a
  standalone locking helper executable - this avoids creating many
  forked copies of ctdbd and potentially running a node out of memory.

* New control CTDB_CONTROL_IPREALLOCATED is now used to generate
  "ipreallocated" events.

* Message handlers are now indexed, providing a significant
  performance improvement.
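
To show why indexing message handlers helps, here is a small sketch
of a handler registry keyed by a message identifier. It is only an
illustration: the MessageHandlers class, the msg_id key and the
dictionary used as the index are assumptions for this example, and
the note above does not say which index structure ctdbd uses.

```python
class MessageHandlers:
    """Registry that finds handlers by key instead of scanning a list."""

    def __init__(self):
        self._by_id = {}  # msg_id -> list of handler callables

    def register(self, msg_id, handler):
        self._by_id.setdefault(msg_id, []).append(handler)

    def dispatch(self, msg_id, data):
        """Call every handler registered for msg_id; return how many ran."""
        handlers = self._by_id.get(msg_id, ())
        for handler in handlers:
            handler(msg_id, data)
        return len(handlers)
```

Dispatch cost then depends on the number of handlers registered for
that one identifier, not on the total number of registered handlers.
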