mirror of https://github.com/samba-team/samba.git synced 2024-12-28 07:21:54 +03:00
samba-mirror/ctdb/NEWS
Martin Schwenke 128e2cb29d doc: Update NEWS
Signed-off-by: Martin Schwenke <martin@meltin.net>
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit c446579fc442955ecc74f5566eaa0635c3171498)
2013-08-22 18:07:49 +10:00


Changes in CTDB 2.4
===================
User-visible changes
--------------------
* A missing network interface now causes monitoring to fail and the
node to become unhealthy.
* Changed ctdb command's default control timeout from 3s to 10s.
* debug-hung-script.sh now includes the output of "ctdb scriptstatus"
to provide more information.
Important bug fixes
-------------------
* Starting the CTDB daemon by running ctdbd directly no longer removes
an existing Unix socket unconditionally.
* ctdbd once again successfully kills client processes on releasing
public IPs. It was checking for them as tracked child processes
and not finding them, so wasn't killing them.
* ctdbd_wrapper now exports CTDB_SOCKET so that child processes of
ctdbd (such as uses of ctdb in eventscripts) use the correct socket.
* Always use Jenkins hash when creating volatile databases. There
were a few places where TDBs would be attached with the wrong flags.
* Vacuuming code fixes in CTDB 2.2 introduced bugs that led to header
corruption for empty records. This left inconsistent headers on
two nodes, so a request for such a record kept bouncing between
nodes indefinitely, logging "High hopcount" messages and degrading
performance.
* ctdbd was losing log messages at shutdown because they weren't being
given time to flush. ctdbd now sleeps for a second during shutdown
to allow time to flush log messages.
* Improved socket handling introduced in CTDB 2.2 caused ctdbd to
process a large number of packets available on a single FD before
polling other FDs. ctdbd now uses fixed-size queue buffers to allow
fair scheduling across multiple FDs.
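The fairness problem in the last item above can be illustrated with a
small model. This is only a sketch, not ctdbd's code: it assumes a
per-FD budget (the hypothetical MAX_PACKETS_PER_TURN) in place of the
fixed-size queue buffers, and shows that a busy FD can no longer
starve a quiet one.

```python
from collections import deque

MAX_PACKETS_PER_TURN = 2  # hypothetical per-FD budget, not a ctdbd constant


def drain_fairly(queues, budget=MAX_PACKETS_PER_TURN):
    """Return packets in the order a budgeted round-robin loop handles them.

    queues maps an FD number to a deque of pending packets.
    """
    order = []
    while any(queues.values()):
        for fd in sorted(queues):
            pending = queues[fd]
            # Take at most `budget` packets from this FD, then move on.
            for _ in range(min(budget, len(pending))):
                order.append((fd, pending.popleft()))
    return order


# FD 3 has a backlog; FD 4 has a single packet waiting.
queues = {3: deque(["a1", "a2", "a3", "a4", "a5"]), 4: deque(["b1"])}
order = drain_fairly(queues)
```

With the budget in place, the packet on FD 4 is handled third rather
than after the whole backlog on FD 3.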
Important internal changes
--------------------------
* A node that fails to take/release multiple IPs will only incur a
single banning credit. This makes a brief failure less likely to
cause a node to be banned.
* ctdb killtcp has been changed to read connections from stdin and
10.interface now uses this feature to improve the time taken to kill
connections.
* Improvements to hot records statistics in ctdb dbstatistics.
* Recovery daemon now assembles up-to-date node flags information
from remote nodes before checking if any flags are inconsistent and
forcing a recovery.
* ctdbd no longer creates multiple lock sub-processes for the same
key. This reduces the number of lock sub-processes substantially.
* Changed the nfsd RPC check failure policy to fail over quickly
instead of trying to repair a node first by restarting NFS. Such
restarts would often hang if the cause of the RPC check failure was
the cluster filesystem or storage.
* Logging improvements relating to high hopcounts and sticky records.
* Make sure lower level tdb messages are logged correctly.
* CTDB commands disable/enable/stop/continue are now resilient to
individual control failures and retry in case of failures.
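The change to lock sub-processes noted above (one per key rather than
one per request) can be sketched as follows. This is an illustrative
model only: the LockRequests class, its method names and the
helpers_started counter are assumptions made for the example and do
not correspond to ctdbd internals.

```python
class LockRequests:
    """Track pending lock requests so each key has at most one helper."""

    def __init__(self):
        self.waiters = {}         # key -> callbacks waiting for the lock
        self.helpers_started = 0  # stands in for spawned sub-processes

    def request(self, key, callback):
        if key in self.waiters:
            # A helper is already working on this key: queue behind it.
            self.waiters[key].append(callback)
            return
        self.waiters[key] = [callback]
        self.helpers_started += 1

    def granted(self, key):
        # Simplification: serve every waiter, in arrival order, once the
        # single helper reports that the lock has been obtained.
        for callback in self.waiters.pop(key, []):
            callback(key)


served = []
locks = LockRequests()
for name in ("r1", "r2", "r3"):
    locks.request(b"record-key", lambda key, name=name: served.append(name))
locks.granted(b"record-key")
```

Three requests for the same key start one helper in this model, which
is the reduction the item describes.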
Changes in CTDB 2.3
===================
User-visible changes
--------------------
* 2 new configuration variables for 60.nfs eventscript:
- CTDB_MONITOR_NFS_THREAD_COUNT
- CTDB_NFS_DUMP_STUCK_THREADS
See ctdb.sysconfig for details.
* Removed DeadlockTimeout tunable. To enable debug of locking issues set
CTDB_DEBUG_LOCKS=/etc/ctdb/debug_locks.sh
* In overall statistics and database statistics, lock buckets have been
updated to use the following timings:
< 1ms, < 10ms, < 100ms, < 1s, < 2s, < 4s, < 8s, < 16s, < 32s, < 64s, >= 64s
* Initscript is now simplified with most CTDB-specific functionality
split out to ctdbd_wrapper, which is used to start and stop ctdbd.
* Add systemd support.
* CTDB subprocesses are now given informative names to allow them to
be easily distinguished when using programs like "top" or "perf".
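As an illustration of the lock bucket timings listed above, the sketch
below maps a lock latency to its bucket label. The function and
constant names are made up for the example, and it assumes that each
"<" bound is exclusive, as the notation suggests.

```python
from bisect import bisect_right

# Upper bounds in seconds, taken from the bucket list in this section.
BUCKET_LIMITS = [0.001, 0.01, 0.1, 1, 2, 4, 8, 16, 32, 64]
BUCKET_LABELS = ["< 1ms", "< 10ms", "< 100ms", "< 1s", "< 2s", "< 4s",
                 "< 8s", "< 16s", "< 32s", "< 64s", ">= 64s"]


def lock_bucket(latency_seconds):
    """Return the label of the bucket that a lock latency falls into."""
    # bisect_right counts the limits that are <= latency, which is the
    # index of the first bucket whose upper bound exceeds the latency.
    return BUCKET_LABELS[bisect_right(BUCKET_LIMITS, latency_seconds)]
```

For example, lock_bucket(3) gives "< 4s", and a latency of exactly
1ms lands in "< 10ms" because the 1ms bound is exclusive.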
Important bug fixes
-------------------
* ctdb tool should not exit from a retry loop if a control times out
(e.g. under high load). This simple fix will stop an exit from the
retry loop on any error.
* When updating flags on all nodes, use the correct updated flags. This
should avoid wrong flag change messages in the logs.
* The recovery daemon will not ban other nodes if the current node
is banned.
* ctdb dbstatistics command now correctly outputs database statistics.
* Fixed a panic with overlapping shutdowns (regression in 2.2).
* Fixed 60.ganesha "monitor" event (regression in 2.2).
* Fixed a buffer overflow in the "reloadips" implementation.
* Fixed segmentation faults in ping_pong (called with incorrect
argument) and test binaries (called when ctdbd not running).
Important internal changes
--------------------------
* The recovery daemon on a stopped or banned node will stop participating in any
cluster activity.
* Improved cluster-wide database traverse by sending the records directly from
the traverse child process to the requesting node.
* TDB checking and dropping of all IPs moved from initscript to "init"
event in 00.ctdb.
* To avoid "rogue IPs" the release IP callback now fails if the
released IP is still present on an interface.
Changes in CTDB 2.2
===================
User-visible changes
--------------------
* The "stopped" event has been removed.
The "ipreallocated" event is now run when a node is stopped. Use
this instead of "stopped".
* New --pidfile option for ctdbd, used by the initscript.
* The 60.nfs eventscript now uses configuration files in
/etc/ctdb/nfs-rpc-checks.d/ for timeouts and actions instead of
hardcoding them into the script.
* Notification handler scripts can now be dropped into /etc/ctdb/notify.d/.
* The NoIPTakeoverOnDisabled tunable has been renamed to
NoIPHostOnAllDisabled and now works properly when set on individual
nodes.
* New ctdb subcommand "runstate" prints the current internal runstate.
Runstates are used for serialising startup.
Important bug fixes
-------------------
* The Unix domain socket is now set to non-blocking after the
connection succeeds. This avoids connections failing with EAGAIN
and not being retried.
* Fetching from the log ringbuffer now succeeds if the buffer is full.
* Fixed a severe recovery bug that could lead to data corruption for SMB clients.
* The statd-callout script now runs as root via sudo.
* "ctdb delip" no longer fails if it is unable to move the IP.
* A race in the ctdb tool's ipreallocate code was fixed. This fixes
potential bugs in the "disable", "enable", "stop", "continue",
"ban", "unban", "ipreallocate" and "sync" commands.
* The monitor cancellation code could sometimes hang indefinitely.
This could cause "ctdb stop" and "ctdb shutdown" to fail.
Important internal changes
--------------------------
* The socket I/O handling has been optimised to improve performance.
* IPs will not be assigned to nodes during CTDB initialisation. They
will only be assigned to nodes that are in the "running" runstate.
* Improved database locking code. One improvement is to use a
standalone locking helper executable - this avoids creating many
forked copies of ctdbd and potentially running a node out of memory.
* New control CTDB_CONTROL_IPREALLOCATED is now used to generate
"ipreallocated" events.
* Message handlers are now indexed, providing a significant
performance improvement.
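The benefit of indexing message handlers, as in the last item above,
can be shown with a small sketch. It assumes handlers are looked up by
a numeric service ID (called srvid here); the MessageHandlers class is
hypothetical and is not ctdbd's implementation.

```python
from collections import defaultdict


class MessageHandlers:
    """Handlers indexed by service ID, so dispatch avoids a full scan."""

    def __init__(self):
        self._by_srvid = defaultdict(list)

    def register(self, srvid, handler):
        self._by_srvid[srvid].append(handler)

    def dispatch(self, srvid, payload):
        # One dictionary lookup replaces walking every registered handler.
        handlers = self._by_srvid.get(srvid, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


received = []
handlers = MessageHandlers()
handlers.register(0x10, received.append)
handlers.register(0x20, lambda payload: received.append(payload.upper()))
delivered = handlers.dispatch(0x20, "ping")
missing = handlers.dispatch(0x30, "ignored")
```

Dispatch cost then depends on the number of handlers registered for
one ID rather than on the total number of handlers.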