mirror of https://github.com/samba-team/samba.git synced 2025-01-25 06:04:04 +03:00

4897 Commits

Author SHA1 Message Date
Amitay Isaacs
500b26e48f common/system: Add ctdb_set_process_name() function
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit fc3689c977f48d7988eed0654fb8e5ce4b8bfc8b)
2013-07-10 14:33:19 +10:00
Amitay Isaacs
4357aebdb9 traverse: Remove unused start_time field
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit dc834d5e78c3fb97ae15cddf1139b3c4a4051a7c)
2013-07-10 14:33:19 +10:00
Amitay Isaacs
bf3dd9488e traverse: Send records directly from traverse child to srcnode
Currently CTDB daemon reads records from a child process and then sends them to
srcnode via TRAVERSE_DATA control.  This ties up main CTDB daemon and also
requires an extra copy of the record in the CTDB daemon.  Instead send records
directly from traverse child process.

The control from child process still goes via local CTDB daemon as there
is no infrastructure currently to open a TCP socket to the srcnode.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 1a74192aa7d51ed99553e7292860027f06b6ef37)
2013-07-10 14:33:19 +10:00
Amitay Isaacs
557b92fc88 traverse: Pass reqid and srcnode information to local database traverse
So that traverse child process can directly send the TRAVERSE_DATA control to
the srcnode without first sending it to local node.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit faabce1b99fb3de9ff03bf54d303e7656538fee3)
2013-07-10 14:33:19 +10:00
Amitay Isaacs
3dcdd39801 packaging: When building with system libraries, add dependency for them
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 8225b3e77e140db34b52571a95d553d1e59e3f1e)
2013-07-10 14:33:18 +10:00
Amitay Isaacs
d46c24f4d0 ctdbd: No need for DeadlockTimeout tunable
The code for detecting deadlocks and killing the smbd process causing them
has been removed and replaced with an external debug script.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 2211cd94bea266547d3e6f167d3160a6b23bec88)
2013-07-10 14:33:18 +10:00
Amitay Isaacs
ae0afad8ee initscript: Export CTDB_DEBUG_LOCKS variable
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit a415a1986900135f889efc25ecaf2761b1dae81a)
2013-07-10 14:33:18 +10:00
Amitay Isaacs
f46d0e783c scripts: Add an example debug_locks.sh script to debug locking issue
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit c711ff4702c5f95b75e4bf030665fc2afffc2f9e)
2013-07-10 14:33:18 +10:00
Amitay Isaacs
c620457c0b locking: Use external script to debug locking issues
Use an external script to parse /proc/locks and log useful debugging
information about locks rather than doing that in C code.

To use this feature, add the following configuration variable to /etc/sysconfig/ctdb:

  CTDB_DEBUG_LOCKS=/etc/ctdb/debug_locks.sh

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 2bfb8499366d530f16515b08928056bbda40f781)
2013-07-10 14:33:18 +10:00
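As an illustration of the kind of parsing an external lock-debugging script performs, here is a minimal Python sketch that reads /proc/locks-style text and pairs blocked waiters with the processes holding locks on the same inode. It is not the debug_locks.sh script shipped with CTDB. The names LockEntry, parse_proc_locks and waiters_and_holders are hypothetical, and the field layout assumed is the one described in proc(5), where a blocked request carries a "->" marker after the lock id.

```python
from typing import Dict, List, NamedTuple, Tuple


class LockEntry(NamedTuple):
    """One line of /proc/locks (hypothetical helper type, not part of CTDB)."""
    blocked: bool  # True when the line describes a waiter ("->" marker)
    kind: str      # e.g. POSIX or FLOCK
    access: str    # READ or WRITE
    pid: int
    inode: str     # "major:minor:inode" exactly as printed by the kernel
    start: str
    end: str


def parse_proc_locks(text: str) -> List[LockEntry]:
    """Parse /proc/locks-style text, skipping lines that do not fit the layout."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue
        fields = fields[1:]            # drop the leading "N:" lock id
        blocked = fields[0] == "->"
        if blocked:
            fields = fields[1:]        # drop the "->" marker
        if len(fields) < 7:
            continue
        kind, _mode, access, pid, inode, start, end = fields[:7]
        try:
            entries.append(
                LockEntry(blocked, kind, access, int(pid), inode, start, end)
            )
        except ValueError:
            continue                   # non-numeric pid: ignore the line
    return entries


def waiters_and_holders(entries: List[LockEntry]) -> List[Tuple[int, List[int]]]:
    """For each blocked waiter, list the pids holding locks on the same inode."""
    holders: Dict[str, List[int]] = {}
    for entry in entries:
        if not entry.blocked:
            holders.setdefault(entry.inode, []).append(entry.pid)
    return [(e.pid, holders.get(e.inode, [])) for e in entries if e.blocked]
```

On a Linux host, the input text could come from reading /proc/locks; the result would then be logged by whatever script CTDB_DEBUG_LOCKS points to.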
Amitay Isaacs
9ae379c91a locking: Update locking bucket intervals
 0   < 1 ms
 1   < 10 ms
 2   < 100 ms
 3   < 1 s
 4   < 2 s
 5   < 4 s
 6   < 8 s
 7   < 16 s
 8   < 32 s
 9   < 64 s
10   >= 64 s

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 6fc36a7036933237d09151a0baf4d8ccd2bc2c99)
2013-07-10 14:33:18 +10:00
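The bucket intervals listed in commit 9ae379c91a can be read as a lookup from a lock latency to a bucket index. The following Python sketch shows one way to express that mapping; it is an illustration only, not the C code in CTDB, and the names BUCKET_UPPER_BOUNDS and latency_bucket are hypothetical.

```python
from bisect import bisect_right

# Exclusive upper bounds, in seconds, for buckets 0..9 as listed in the
# commit message. Anything at or above the last bound falls in bucket 10.
BUCKET_UPPER_BOUNDS = [0.001, 0.01, 0.1, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]


def latency_bucket(latency_seconds: float) -> int:
    """Return the bucket index (0..10) for a lock latency given in seconds."""
    # bisect_right counts the bounds that are <= latency, which is exactly
    # the index of the first bucket whose upper bound exceeds the latency.
    return bisect_right(BUCKET_UPPER_BOUNDS, latency_seconds)
```

Because each upper bound is exclusive, a latency of exactly 1 ms lands in bucket 1 and a latency of exactly 64 s lands in bucket 10.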
Amitay Isaacs
1afb7fccb2 locking: Update locks latency in CTDB statistics only for RECORD or DB locks
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit dcc42a75b4638b3aa40c44ed9e0aaae26483e2b0)
2013-07-10 14:33:18 +10:00
Amitay Isaacs
81e6d60f01 tools/ctdb: Fix the format of DB statistics output
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 594c421f90ce132c75fbd985872114e4967f92b5)
2013-07-10 14:33:18 +10:00
Amitay Isaacs
d36aa928fd ctdbd: Remove incomplete ctdb_db_statistics_wire structure
Send the ctdb_db_statistics directly instead of first copying it to
duplicate ctdb_db_statistics_wire structure.  This simplifies the
implementation of the control to get database statistics.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 26a4653df594d351ca0dc1bd5f5b2f5b0eb0a9a5)
2013-07-10 14:33:18 +10:00
Amitay Isaacs
c0798dfb64 ctdbd: Update debug messages for setting readonly property on database
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 545a46437dfb2b755bb2fddb11dea8c4ccce3ed7)
2013-07-10 14:32:52 +10:00
Amitay Isaacs
bcb64aa55f recoverd: Fix buffer overflow error in reloadips
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-Programmed-With: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 41182623891d74a7e9e9c453183411a161201e67)
2013-07-05 15:52:34 +10:00
Martin Schwenke
f92e49f6f8 tests/eventscripts: Add some rudimentary tests for 60.ganesha
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit e1cf1f728236d808bb41265e74bc65f54bf1c133)
2013-07-05 15:52:34 +10:00
Martin Schwenke
d6d1fb1f46 eventscripts: New configuration variable $CTDB_SKIP_GANESHA_NFSD_CHECK
This allows 60.ganesha to be unit tested, except for the core Ganesha
monitoring code.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit f606df4f2db754592e6d1a16c26e155cacb2beef)
2013-07-05 15:52:33 +10:00
Martin Schwenke
7f6169b207 eventscript: Move Ganesha nfsd monitoring to a function
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit ceb5b2d37f7ab4894908ec26f3812b3bed991525)
2013-07-05 15:52:33 +10:00
Martin Schwenke
c3e83d4532 eventscripts: Drop RPC service version from nfs_check_rpc_service() calls
Support for this was removed in commit
77302dbfd85754e02559eccb2dd6c090db0b6b9f and I overlooked its use in
60.ganesha.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 520914e7ee1b879c1080e5857fda18ed5b973fd6)
2013-07-05 15:52:33 +10:00
Martin Schwenke
dcdae86dc7 ctdbd: Log something when releasing all IPs
At the moment this is silent and it can be confusing to see IPs just
disappear.

Also, this message:

  Been in recovery mode for too long. Dropping all IPS

can cause anxiety when all IPs should already have been dropped.
Adding a comforting message saying that 0 IPs were dropped relieves
such anxiety.  :-)

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 4d0f26b306fc465d551d340b0e7dce4412eae3fd)
2013-07-05 15:52:33 +10:00
Martin Schwenke
0108e8ff10 recoverd: Minor style improvements for ctdb_reload_remote_public_ips()
* Add a variable to the loop to make the code more readable and have
  it generally fit into 80 columns.

* Improve comments.

* Improve log messages.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 0a292fa8939a1343e44cadaa8ed9f3c0f18ca82f)
2013-07-05 15:52:33 +10:00
Martin Schwenke
7290798a41 recoverd: Clean up log messages in remote IP verification
The log messages in verify_remote_ip_allocation() are confusing
because they don't include the PNN of the problem node, because it is
not known in this function.

Add the PNN of the node being verified as a function argument and then
shuffle the log messages around to make them clearer.

Also fold 3 nested if statements into just one.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit f0942fa01cd422133fc9398f56b4855397d7bc86)
2013-07-05 15:52:33 +10:00
Martin Schwenke
15115becef recoverd: Fix an unclear log message - "Restart recovery process"
When the recovery master notices a node in recovery mode, it starts the
recovery process; it doesn't restart it.

Update documentation to match.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 298c4d2c3b4ea3d900c91f5a0a5aca2952a13d61)
2013-07-05 15:52:33 +10:00
Martin Schwenke
bfe0b93652 recoverd: Fix an incorrect comment
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 9f6cd8b0bea619991c9f3bf35188c5950dabf8f4)
2013-07-05 15:52:33 +10:00
Martin Schwenke
9c8cc863f7 ctdbd: Use ctdb_die() on "setup" event failure
This is slightly easier to read because it all fits on 1 line.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 035bf3eecf99337c84d4ad16cdbf297b1fa037db)
2013-07-05 15:52:33 +10:00
Martin Schwenke
c327c91490 ctdbd: Avoid a core dump when "init" event fails
The "init" event only really fails in the scripts, which should log
something useful on failure.  Therefore, a core dump isn't terribly
useful and sometimes attracts unwanted attention.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 3af2d833b63af9931792106db71797f3692669a8)
2013-07-05 15:52:33 +10:00
Martin Schwenke
dbd1759eae util: New function ctdb_die()
This is like ctdb_fatal() but exits cleanly without dumping core or
generating a backtrace.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c0a9456692c88a7a5542cd893d8f326524d3f94e)
2013-07-05 15:52:33 +10:00
Martin Schwenke
4e07c6c433 eventscripts: When replaying monitor status, don't log empty output
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit ce04f1c107b4392ca955d9f29b93aaaae62439ce)
2013-07-05 15:52:33 +10:00
Martin Schwenke
26b161156a ctdbd: Release IP callback should fail if the IP is still hosted
At the moment there are (at least) 2 bugs that cause rogue IPs:

* A race where release_ip_callback() runs after a "subsequent" take IP
  has completed.  The IP is back on an interface but we unset
  vnn->iface in the callback.

* A "releaseip" eventscript times out.  We ignore the timeout and call
  it success, deleting the VNN even if the IP is still hosted.

  We could decide not to ignore the timeout and ban the node, but
  killing TCP connections can take a long time and that might result
  in a lot of banning.  We probably won't reinstate banning on
  "releaseip" until killing TCP connections has been optimised.

In both cases, a rogue IP can be avoided by leaving vnn->iface set and
simply failing the control.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit c5797f2942e83da24df548ea07196fbbac0eab20)
2013-07-05 15:52:32 +10:00
Martin Schwenke
793233f6b6 ctdbd: Log warnings in release IP when unexpected interface is encountered
Previous code changes work around potential problems but do not
provide useful information when a problem occurs.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit f1f1b0c24b9b6cd24b83a4e4da16e179287ec6ac)
2013-07-05 15:52:32 +10:00
Amitay Isaacs
cc6772c968 ping_pong: Validate num_locks argument > 0
This fixes the floating point error if num_locks = 0.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 16afe36de52561a62372c14b567683dc898369d5)
2013-07-04 20:43:52 +10:00
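The failure fixed in commit cc6772c968 is typical of a count that is later used as a divisor or modulus: in C, integer division by zero usually terminates the process with SIGFPE, which the shell reports as a floating point exception. The Python sketch below shows the same validate-before-use pattern. It assumes, without confirmation from the commit message, that the lock count is used as a modulus; parse_num_locks and next_lock_index are hypothetical names, not functions from ping_pong.

```python
def parse_num_locks(arg: str) -> int:
    """Parse the num_locks command-line argument and require it to be > 0."""
    try:
        num_locks = int(arg)
    except ValueError:
        raise ValueError(f"num_locks must be an integer, got {arg!r}") from None
    if num_locks <= 0:
        raise ValueError(f"num_locks must be > 0, got {num_locks}")
    return num_locks


def next_lock_index(i: int, num_locks: int) -> int:
    """Advance around a ring of num_locks slots (assumed usage of the count)."""
    # Safe only because parse_num_locks has already rejected zero.
    return (i + 1) % num_locks
```

Rejecting the argument up front keeps the error message close to its cause instead of surfacing later as a crash inside the locking loop.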
Amitay Isaacs
cc3ffdbc1a tests: If connection to ctdb daemon fails, exit
This fixes the segmentation error if any of the test code fails to
connect to CTDB daemon.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit d48eecd748830598f4f080952f2bf05d6f92738c)
2013-07-04 20:43:52 +10:00
Amitay Isaacs
6391f61fbc build: Fix compiler warnings for uninitialized variables
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 5408c5c4050539e5aa06a5e82ceb63a6cb5cef0c)
2013-07-04 20:43:52 +10:00
Amitay Isaacs
f032c60cd5 recoverd: Send the result from child process only once
The result has already been sent before the child starts waiting for the
parent ctdbd process.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 9aa13bcedd83d463c871e3cf1f3a65da3cd83992)
2013-07-04 20:43:52 +10:00
Amitay Isaacs
a11e8ab75a packaging: Enable compiler optimizations
This reverts d09570c70551aa40390ce9ceffe7bc234e1afafe.

... hoping the segv has been found in last 6 years. :-)

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 9b529189f8456fad7868fc154ae27a6fd87e93b3)
2013-07-04 20:43:52 +10:00
Amitay Isaacs
b169182ff2 packaging: Allow building RPMs with system tdb/talloc/tevent
To build CTDB RPMs with system installed libraries, use the following command:

  ./packaging/RPM/makerpms.sh \
    --with system_talloc \
    --with system_tdb \
    --with system_tevent

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit bb54f3924ff19cd089b0a166fe8368db162ad709)
2013-07-04 20:41:51 +10:00
Amitay Isaacs
ae03a5e3ee packaging: Do not mark /etc/ctdb/functions as configuration file
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 1b0faae9c939a2f8da3cacba715ca62a5830d190)
2013-07-04 16:49:22 +10:00
Amitay Isaacs
71930e12b5 packaging: Install README.notify.d using %doc directive
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-Programmed-With: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 53d34eb2f9e5434dea4e7182b6af566a3a96a368)
2013-07-04 16:49:15 +10:00
Amitay Isaacs
4a7f01f37e packaging: Install docs using %doc directive
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-Programmed-With: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 6fe584d05543eebd24abd19bab502dc4da04e921)
2013-07-04 16:49:06 +10:00
Amitay Isaacs
dfa845151a packaging: Remove ctdb_transaction from docdir
It's bundled in ctdb-tests package.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 7e53fbf92b6dd5211d918ea0e23126b7dfa50c42)
2013-07-04 14:30:46 +10:00
Martin Schwenke
ab68cf3446 doc: Add a disclaimer for the EnableBans tunable
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 145b1966c1b34f1667a175235e1df2741294391c)
2013-07-04 14:30:18 +10:00
Martin Schwenke
0c5d2fb5a7 doc: Add banning bug fixes to NEWS
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit b4c06e8ec8b227c1e6c01444038c3b15b5f9e606)
2013-07-04 14:30:02 +10:00
Amitay Isaacs
c944a589ca ctdbd: Don't ban self if init or shutdown event fails
There is no point in banning the node if init or shutdown event times
out since it's going to quit anyway.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit ef1c4e99ca66e7a990bc557f34abb624c315e6ba)
2013-07-02 12:59:09 +10:00
Amitay Isaacs
29adaae093 doc: The second half of monitoring is only for recovery master
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit fcd5e1f04c5fe6c98399429b8f0918b8779acba6)
2013-07-02 12:59:09 +10:00
Michael Adam
3c65197b7a recoverd: when the recmaster is banned, use that information when forcing an election
When we trigger an election because the recmaster considers itself inactive,
update our local nodemap with the recmaster's flags before calling
force_election(). This way, we don't send the inactive node commands
(e.g. freeze) that may fail and then lead to us getting banned.

The theory is that this should help avoid banning loops.

Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit 932360992b08a5483d90c0590218ba0fd756119e)
2013-07-02 12:59:09 +10:00
Michael Adam
082da536cb recoverd: fix a comment typo
Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit 741944f118e98f178b860194eecb215180949d18)
2013-07-02 12:59:09 +10:00
Michael Adam
159b9a2989 recoverd: fix a comment in main_loop
Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit ac06c46e4a80c635f6094b5ac6f0bf3e3a02db95)
2013-07-02 12:59:09 +10:00
Michael Adam
26365f2a5f recoverd: eliminate some trailing spaces from ctdb_election_win()
Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit df30c0a05ed908fc2a997c56ff5484736b23b70f)
2013-07-02 12:59:09 +10:00
Martin Schwenke
aa79a656a7 recoverd: Don't continue if the current node gets banned
A banned node cannot continue with recovery or cluster monitoring.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 14399de1dd0bd8dabf1f48b1457e3ccb37589d8a)
2013-07-02 12:59:09 +10:00
Amitay Isaacs
b29b6ae39e recoverd: Refactor code to ban misbehaving nodes
Since we have nodemap information, there is no need to hardcode the
limit of 20.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-Programmed-With: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit aea12dce83ef385e9fb3bc03ac7ace0874a0e3fe)
2013-07-02 12:59:09 +10:00