mirror of
https://github.com/samba-team/samba.git
synced 2024-12-24 21:34:56 +03:00
2f5d9be754
Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org>
355 lines
12 KiB
Plaintext
355 lines
12 KiB
Plaintext
Changes in CTDB 2.5.1
|
|
=====================
|
|
|
|
Important bug fixes
|
|
-------------------
|
|
|
|
* The locking code now correctly implements a per-database active
|
|
locks limit. Whole database lock requests can no longer be denied
|
|
because there are too many active locks - this is particularly
|
|
important for freezing databases during recovery.
|
|
|
|
* The debug_locks.sh script locks against itself. If it is already
|
|
running then subsequent invocations will exit immediately.
|
|
|
|
* ctdb tool commands that operate on databases now work correctly when
|
|
a database ID is given.
|
|
|
|
* Various code fixes for issues found by Coverity.
|
|
|
|
Important internal changes
|
|
--------------------------
|
|
|
|
* statd-callout has been updated so that statd client information is
|
|
always up-to-date across the cluster. This is implemented by
|
|
storing the client information in a persistent database using a new
|
|
"ctdb ptrans" command.
|
|
|
|
* The transaction code for persistent databases now retries until it
|
|
is able to take the transaction lock. This makes the transation
|
|
semantics compatible with Samba's implementation.
|
|
|
|
* Locking helpers are created with vfork(2) instead of fork(2),
|
|
providing a performance improvement.
|
|
|
|
* config.guess has been updated to the latest upstream version so CTDB
|
|
should build on more platforms.
|
|
|
|
|
|
Changes in CTDB 2.5
|
|
===================
|
|
|
|
User-visible changes
|
|
--------------------
|
|
|
|
* The default location of the ctdbd socket is now:
|
|
|
|
/var/run/ctdb/ctdbd.socket
|
|
|
|
If you currently set CTDB_SOCKET in configuration then unsetting it
|
|
will probably do what you want.
|
|
|
|
* The default location of CTDB TDB databases is now:
|
|
|
|
/var/lib/ctdb
|
|
|
|
If you only set CTDB_DBDIR (to the old default of /var/ctdb) then
|
|
you probably want to move your databases to /var/lib/ctdb, drop your
|
|
setting of CTDB_DBDIR and just use the default.
|
|
|
|
To maintain the database files in /var/ctdb you will need to set
|
|
CTDB_DBDIR, CTDB_DBDIR_PERSISTENT and CTDB_DBDIR_STATE, since all of
|
|
these have moved.
|
|
|
|
* Use of CTDB_OPTIONS to set ctdbd command-line options is no longer
|
|
supported. Please use individual configuration variables instead.
|
|
|
|
* Obsolete tunables VacuumDefaultInterval, VacuumMinInterval and
|
|
VacuumMaxInterval have been removed. Setting them had no effect but
|
|
if you now try to set them in a configuration files via CTDB_SET_X=Y
|
|
then CTDB will not start.
|
|
|
|
* Much improved manual pages. Added new manpages ctdb(7),
|
|
ctdbd.conf(5), ctdb-tunables(7). Still some work to do.
|
|
|
|
* Most CTDB-specific configuration can now be set in
|
|
/etc/ctdb/ctdbd.conf.
|
|
|
|
This avoids cluttering distribution-specific configuration files,
|
|
such as /etc/sysconfig/ctdb. It also means that we can say: see
|
|
ctdbd.conf(5) for more details. :-)
|
|
|
|
* Configuration variable NFS_SERVER_MODE is deprecated and has been
|
|
replaced by CTDB_NFS_SERVER_MODE. See ctdbd.conf(5) for more
|
|
details.
|
|
|
|
* "ctdb reloadips" is much improved and should be used for reloading
|
|
the public IP configuration.
|
|
|
|
This commands attempts to yield much more predictable IP allocations
|
|
than using sequences of delip and addip commands. See ctdb(1) for
|
|
details.
|
|
|
|
* Ability to pass comma-separated string to ctdb(1) tool commands via
|
|
the -n option is now documented and works for most commands. See
|
|
ctdb(1) for details.
|
|
|
|
* "ctdb rebalancenode" is now a debugging command and should not be
|
|
used in normal operation. See ctdb(1) for details.
|
|
|
|
* "ctdb ban 0" is now invalid.
|
|
|
|
This was documented as causing a permanent ban. However, this was
|
|
not implemented and caused an "unban" instead. To avoid confusion,
|
|
0 is now an invalid ban duration. To administratively "ban" a node
|
|
use "ctdb stop" instead.
|
|
|
|
* The systemd configuration now puts the PID file in /run/ctdb (rather
|
|
than /run/ctdbd) for consistency with the initscript and other uses
|
|
of /var/run/ctdb.
|
|
|
|
Important bug fixes
|
|
-------------------
|
|
|
|
* Traverse regression fixed.
|
|
|
|
* The default recovery method for persistent databases has been
|
|
changed to use database sequence numbers instead of doing
|
|
record-by-record recovery (using record sequence numbers). This
|
|
fixes issues including registry corruption.
|
|
|
|
* Banned nodes are no longer told to run the "ipreallocated" event
|
|
during a takeover run, when in fallback mode with nodes that don't
|
|
support the IPREALLOCATED control.
|
|
|
|
Important internal changes
|
|
--------------------------
|
|
|
|
* Persistent transactions are now compatible with Samba and work
|
|
reliably.
|
|
|
|
* The recovery master role has been made more stable by resetting the
|
|
priority time each time a node becomes inactive. This means that
|
|
nodes that are active for a long time are more likely to retain the
|
|
recovery master role.
|
|
|
|
* The incomplete libctdb library has been removed.
|
|
|
|
* Test suite now starts ctdbd with the --sloppy-start option to speed
|
|
up startup. However, this should not be done in production.
|
|
|
|
|
|
Changes in CTDB 2.4
|
|
===================
|
|
|
|
User-visible changes
|
|
--------------------
|
|
|
|
* A missing network interface now causes monitoring to fail and the
|
|
node to become unhealthy.
|
|
|
|
* Changed ctdb command's default control timeout from 3s to 10s.
|
|
|
|
* debug-hung-script.sh now includes the output of "ctdb scriptstatus"
|
|
to provide more information.
|
|
|
|
Important bug fixes
|
|
-------------------
|
|
|
|
* Starting CTDB daemon by running ctdbd directly should not remove
|
|
existing unix socket unconditionally.
|
|
|
|
* ctdbd once again successfully kills client processes on releasing
|
|
public IPs. It was checking for them as tracked child processes
|
|
and not finding them, so wasn't killing them.
|
|
|
|
* ctdbd_wrapper now exports CTDB_SOCKET so that child processes of
|
|
ctdbd (such as uses of ctdb in eventscripts) use the correct socket.
|
|
|
|
* Always use Jenkins hash when creating volatile databases. There
|
|
were a few places where TDBs would be attached with the wrong flags.
|
|
|
|
* Vacuuming code fixes in CTDB 2.2 introduced bugs in the new code
|
|
which led to header corruption for empty records. This resulted
|
|
in inconsistent headers on two nodes and a request for such a record
|
|
keeps bouncing between nodes indefinitely and logs "High hopcount"
|
|
messages in the log. This also caused performance degradation.
|
|
|
|
* ctdbd was losing log messages at shutdown because they weren't being
|
|
given time to flush. ctdbd now sleeps for a second during shutdown
|
|
to allow time to flush log messages.
|
|
|
|
* Improved socket handling introduced in CTDB 2.2 caused ctdbd to
|
|
process a large number of packets available on single FD before
|
|
polling other FDs. Use fixed size queue buffers to allow fair
|
|
scheduling across multiple FDs.
|
|
|
|
Important internal changes
|
|
--------------------------
|
|
|
|
* A node that fails to take/release multiple IPs will only incur a
|
|
single banning credit. This makes a brief failure less likely to
|
|
cause node to be banned.
|
|
|
|
* ctdb killtcp has been changed to read connections from stdin and
|
|
10.interface now uses this feature to improve the time taken to kill
|
|
connections.
|
|
|
|
* Improvements to hot records statistics in ctdb dbstatistics.
|
|
|
|
* Recovery daemon now assembles up-to-date node flags information
|
|
from remote nodes before checking if any flags are inconsistent and
|
|
forcing a recovery.
|
|
|
|
* ctdbd no longer creates multiple lock sub-processes for the same
|
|
key. This reduces the number of lock sub-processes substantially.
|
|
|
|
* Changed the nfsd RPC check failure policy to failover quickly
|
|
instead of trying to repair a node first by restarting NFS. Such
|
|
restarts would often hang if the cause of the RPC check failure was
|
|
the cluster filesystem or storage.
|
|
|
|
* Logging improvements relating to high hopcounts and sticky records.
|
|
|
|
* Make sure lower level tdb messages are logged correctly.
|
|
|
|
* CTDB commands disable/enable/stop/continue are now resilient to
|
|
individual control failures and retry in case of failures.
|
|
|
|
|
|
Changes in CTDB 2.3
|
|
===================
|
|
|
|
User-visible changes
|
|
--------------------
|
|
|
|
* 2 new configuration variables for 60.nfs eventscript:
|
|
|
|
- CTDB_MONITOR_NFS_THREAD_COUNT
|
|
- CTDB_NFS_DUMP_STUCK_THREADS
|
|
|
|
See ctdb.sysconfig for details.
|
|
|
|
* Removed DeadlockTimeout tunable. To enable debug of locking issues set
|
|
|
|
CTDB_DEBUG_LOCKS=/etc/ctdb/debug_locks.sh
|
|
|
|
* In overall statistics and database statistics, lock buckets have been
|
|
updated to use following timings:
|
|
|
|
< 1ms, < 10ms, < 100ms, < 1s, < 2s, < 4s, < 8s, < 16s, < 32s, < 64s, >= 64s
|
|
|
|
* Initscript is now simplified with most CTDB-specific functionality
|
|
split out to ctdbd_wrapper, which is used to start and stop ctdbd.
|
|
|
|
* Add systemd support.
|
|
|
|
* CTDB subprocesses are now given informative names to allow them to
|
|
be easily distinguished when using programs like "top" or "perf".
|
|
|
|
Important bug fixes
|
|
-------------------
|
|
|
|
* ctdb tool should not exit from a retry loop if a control times out
|
|
(e.g. under high load). This simple fix will stop an exit from the
|
|
retry loop on any error.
|
|
|
|
* When updating flags on all nodes, use the correct updated flags. This
|
|
should avoid wrong flag change messages in the logs.
|
|
|
|
* The recovery daemon will not ban other nodes if the current node
|
|
is banned.
|
|
|
|
* ctdb dbstatistics command now correctly outputs database statistics.
|
|
|
|
* Fixed a panic with overlapping shutdowns (regression in 2.2).
|
|
|
|
* Fixed 60.ganesha "monitor" event (regression in 2.2).
|
|
|
|
* Fixed a buffer overflow in the "reloadips" implementation.
|
|
|
|
* Fixed segmentation faults in ping_pong (called with incorrect
|
|
argument) and test binaries (called when ctdbd not running).
|
|
|
|
Important internal changes
|
|
--------------------------
|
|
|
|
* The recovery daemon on stopped or banned node will stop participating in any
|
|
cluster activity.
|
|
|
|
* Improve cluster wide database traverse by sending the records directly from
|
|
traverse child process to requesting node.
|
|
|
|
* TDB checking and dropping of all IPs moved from initscript to "init"
|
|
event in 00.ctdb.
|
|
|
|
* To avoid "rogue IPs" the release IP callback now fails if the
|
|
released IP is still present on an interface.
|
|
|
|
|
|
Changes in CTDB 2.2
|
|
===================
|
|
|
|
User-visible changes
|
|
--------------------
|
|
|
|
* The "stopped" event has been removed.
|
|
|
|
The "ipreallocated" event is now run when a node is stopped. Use
|
|
this instead of "stopped".
|
|
|
|
* New --pidfile option for ctdbd, used by initscript
|
|
|
|
* The 60.nfs eventscript now uses configuration files in
|
|
/etc/ctdb/nfs-rpc-checks.d/ for timeouts and actions instead of
|
|
hardcoding them into the script.
|
|
|
|
* Notification handler scripts can now be dropped into /etc/ctdb/notify.d/.
|
|
|
|
* The NoIPTakeoverOnDisabled tunable has been renamed to
|
|
NoIPHostOnAllDisabled and now works properly when set on individual
|
|
nodes.
|
|
|
|
* New ctdb subcommand "runstate" prints the current internal runstate.
|
|
Runstates are used for serialising startup.
|
|
|
|
Important bug fixes
|
|
-------------------
|
|
|
|
* The Unix domain socket is now set to non-blocking after the
|
|
connection succeeds. This avoids connections failing with EAGAIN
|
|
and not being retried.
|
|
|
|
* Fetching from the log ringbuffer now succeeds if the buffer is full.
|
|
|
|
* Fix a severe recovery bug that can lead to data corruption for SMB clients.
|
|
|
|
* The statd-callout script now runs as root via sudo.
|
|
|
|
* "ctdb delip" no longer fails if it is unable to move the IP.
|
|
|
|
* A race in the ctdb tool's ipreallocate code was fixed. This fixes
|
|
potential bugs in the "disable", "enable", "stop", "continue",
|
|
"ban", "unban", "ipreallocate" and "sync" commands.
|
|
|
|
* The monitor cancellation code could sometimes hang indefinitely.
|
|
This could cause "ctdb stop" and "ctdb shutdown" to fail.
|
|
|
|
Important internal changes
|
|
--------------------------
|
|
|
|
* The socket I/O handling has been optimised to improve performance.
|
|
|
|
* IPs will not be assigned to nodes during CTDB initialisation. They
|
|
will only be assigned to nodes that are in the "running" runstate.
|
|
|
|
* Improved database locking code. One improvement is to use a
|
|
standalone locking helper executable - the avoids creating many
|
|
forked copies of ctdbd and potentially running a node out of memory.
|
|
|
|
* New control CTDB_CONTROL_IPREALLOCATED is now used to generate
|
|
"ipreallocated" events.
|
|
|
|
* Message handlers are now indexed, providing a significant
|
|
performance improvement.
|