1
0
mirror of https://github.com/samba-team/samba.git synced 2024-12-24 21:34:56 +03:00
Commit Graph

6206 Commits

Author SHA1 Message Date
Amitay Isaacs
c51b8c2234 ctdb-recovery-helper: Add banning to parallel recovery
If one or more nodes are misbehaving during recovery, keep track of
failures as ban_credits.  If the node with the highest ban_credits exceeds
5 ban credits, then tell recovery daemon to assign banning credits.

This will ban only a single node at a time in case of recovery failure.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri Mar 25 06:57:32 CET 2016 on sn-devel-144
2016-03-25 06:57:32 +01:00
Amitay Isaacs
ae366fb932 ctdb-recoverd: Add message handler to assigning banning credits
This will be called from recovery helper to assign banning credits to
misbehaving node.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:16 +01:00
Amitay Isaacs
fc63eae80b ctdb-protocol: Add srvid for assigning banning credits
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:16 +01:00
Amitay Isaacs
ad7a407a13 ctdb-recovery-helper: Introduce new #define variable
... instead of hardcoding number of retries.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:16 +01:00
Amitay Isaacs
e5a714a3c2 ctdb-recovery-helper: Improve log message
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:16 +01:00
Amitay Isaacs
a7b8ee87fe ctdb-tests: Add a test for recovery of large databases
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:15 +01:00
Amitay Isaacs
ffea827bae ctdb-recovery-helper: Introduce push database abstraction
This abstraction uses capabilities of the remote nodes to either send
older PUSH_DB controls or newer DB_PUSH_START and DB_PUSH_CONFIRM
controls.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:15 +01:00
Amitay Isaacs
b96a4759b3 ctdb-recovery-helper: Introduce pull database abstraction
This abstraction depending on the capability of the remote node either
uses older PULL_DB control or newer DB_PULL control.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:15 +01:00
Amitay Isaacs
bb6541b386 ctdb-protocol: Add new capability
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:15 +01:00
Amitay Isaacs
c5776f0529 ctdb-protocol: Add srvid for messages during recovery
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:15 +01:00
Amitay Isaacs
b1e8714bb8 ctdb-protocol: Introduce variable for checking srvid prefix
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:15 +01:00
Amitay Isaacs
e1fdfdd1c1 ctdb-recovery-helper: Write recovery records to a recovery file
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:15 +01:00
Amitay Isaacs
9058fe06df ctdb-recovery-helper: Re-factor function to retain records from recdb
Also, rename traverse function and traverse state for recdb_records
consistently.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:15 +01:00
Amitay Isaacs
76f653f0bc ctdb-protocol: Add file IO functions for ctdb_rec_buffer
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:15 +01:00
Amitay Isaacs
a80ff09ed3 ctdb-recovery-helper: Create accessors for recdb structure fields
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:15 +01:00
Amitay Isaacs
70011a1bfb ctdb-recovery-helper: Rename pnn to dmaster in recdb_records()
This variable is used to set the dmaster value for each record in
recdb_traverse().

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:15 +01:00
Amitay Isaacs
5b926d882e ctdb-recovery-helper: Pass capabilities to database recovery functions
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:15 +01:00
Amitay Isaacs
5f43f92796 ctdb-recovery-helper: Factor out generic recv function
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:15 +01:00
Amitay Isaacs
338e0dccd9 ctdb-client: Add client API functions for new controls
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:15 +01:00
Amitay Isaacs
95a15cde45 ctdb-daemon: Implement new controls DB_PULL and DB_PUSH_START/DB_PUSH_CONFIRM
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:15 +01:00
Amitay Isaacs
0fd156ae84 ctdb-protocol: Add new controls DB_PULL and DB_PUSH_START/DB_PUSH_CONFIRM
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:14 +01:00
Amitay Isaacs
fe69b72569 ctdb-protocol: Add new data type ctdb_pulldb_ext for new control
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:14 +01:00
Amitay Isaacs
c41808e6d2 ctdb-tunables: Add new tunable RecBufferSizeLimit
This will be used to limit the size of record buffer sent in newer
controls for recovery and existing controls for vacuuming.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:14 +01:00
Amitay Isaacs
67799c73af ctdb-client: Add client API for sending message to multiple nodes
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-25 03:26:14 +01:00
Martin Schwenke
e4f592539d ctdb-daemon: Replace an unsafe strcpy(3) call
Tweak another strncpy(3) call.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Jeremy Allison <jra@samba.org>
2016-03-22 00:23:20 +01:00
Martin Schwenke
0ffa5d8d9e ctdb-daemon: Validate length of new interface names
Interface names that are too long will be truncated by strncpy(3)
later on.  It is better to validate the length of each new interface
name to ensure it will be usable.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Jeremy Allison <jra@samba.org>
2016-03-22 00:23:20 +01:00
Volker Lendecke
deaab95b8d ctdb: Fix CID 1356313 Explicit null dereferenced
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
2016-03-18 00:29:14 +01:00
Amitay Isaacs
bcb671421b ctdb-tests: Add a utility to parse ctdb packets
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Mar 17 13:56:41 CET 2016 on sn-devel-144
2016-03-17 13:56:41 +01:00
Amitay Isaacs
6cdb927e76 ctdb-protocol: Add protocol debug routines
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-17 10:17:47 +01:00
Amitay Isaacs
0fa2853ce1 ctdb-protocol: Check header is not null before copying
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-17 10:17:47 +01:00
Amitay Isaacs
ad5b9c3df2 ctdb-client: Increase the timeout for TRANS3_COMMIT control
On a busy system, TRANS3_COMMIT control can take upto or longer than
3 seconds.  On timeout, there are few possible outcomes.

1. The transaction has completed on all nodes and TRANS3_COMMIT control
   has returned.  In such a case, there is no problem.

2. The transaction has completed on the local node, but TRANS3_COMMIT
   control is still active.  In such a case, ctdb_transaction_commit()
   can return successfully.  If this is being called from ctdb, then
   ctdb will exit.  This will cause ctdb daemon to trigger recovery
   since the client exited while transaction is active.  This will cause
   unnecessary recovery.

3. Database recovery was started and ctdb_transaction_commit() will
   retry till the recovery completes the transaction.

Increasing the timeout to 30 seconds will avoid the spurious database
recoveries when TRANS3_COMMIT control takes longer to finish.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>

Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Fri Mar 11 19:59:53 CET 2016 on sn-devel-144
2016-03-11 19:59:53 +01:00
Martin Schwenke
fa8bd41009 ctdb-tunables: Mark tunable DeferredRebalanceOnNodeAdd obsolete
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu Mar 10 06:51:46 CET 2016 on sn-devel-144
2016-03-10 06:51:46 +01:00
Martin Schwenke
c9e69a4b2e ctdb-recoverd: Drop use of DeferredRebalanceOnNodeAdd tunable
If set, this was used to setup an IP takeover run on a timer after
certain updates to the public IP address configuration (e.g. "ctdb
addip").

However, "ctdb reloadips" completely manages public IP reconfiguration
and avoids the anomalies that DeferredRebalanceOnNodeAdd was
introduced to work around.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2016-03-10 03:34:19 +01:00
Martin Schwenke
d678684695 ctdb-tools: Drop "ctdb rebalancenode"
This was a workaround for trying to ensure public IP addresses are
properly rebalanced after running "ctdb addip" on multiple nodes.
"ctdb reloadips" is a better solution.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2016-03-10 03:34:19 +01:00
Martin Schwenke
aaa57fbcb3 ctdb-tools: Drop "ctdb rebalanceip"
This is undocumented and is not needed.  It was a workaround for
trying to ensure public IP addresses are properly rebalanced after
running "ctdb addip" on multiple nodes.  "ctdb reloadips" is a better
solution.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2016-03-10 03:34:19 +01:00
Martin Schwenke
21ec67e2e9 ctdb-doc: Drop outdated NEWS file
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2016-03-10 03:34:19 +01:00
Amitay Isaacs
963d0825fc ctdb-doc: Update ctdb man page
Update ctdb statistics and ctdb dbstatistics output.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-10 03:34:19 +01:00
Amitay Isaacs
2db5e3ed71 ctdb-doc: Update ctdb man page
Do not use obsolete tunables in examples.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-10 03:34:19 +01:00
Amitay Isaacs
bd23b43bfe ctdb-tunables: Fix the implementation of LIST_TUNABLES control
Do not assume the first tunable is not obsolete.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-10 03:34:18 +01:00
Amitay Isaacs
700f39372a ctdb-recovery-helper: Get tunables first, so control timeout can be set
During the recovery process, the timeout value for sending all controls
is decided by RecoverTimeout tunable.  So in the recovery process,
first get the tunables, so the control timeout gets set correctly.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-10 03:34:18 +01:00
Amitay Isaacs
3680aba212 ctdb-doc: Add documentation for missing tunables
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-10 03:34:18 +01:00
Amitay Isaacs
e2539088e0 ctdb-doc: Update tunables documentation
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-10 03:34:18 +01:00
Amitay Isaacs
4a78200f43 ctdb-tunables: Mark tunable ReclockPingPeriod obsolete
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-10 03:34:18 +01:00
Amitay Isaacs
73ab0f9911 ctdb-tunables: Mark tunable MaxRedirectCount obsolete
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-10 03:34:18 +01:00
Amitay Isaacs
aa700deb64 ctdb-tunables: Add missing flags in the initializer
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-10 03:34:18 +01:00
Amitay Isaacs
4bf6cab4a1 ctdb-doc: Sort the tunable variables in alphabetical order
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2016-03-10 03:34:18 +01:00
Martin Schwenke
05051bd115 ctdb-tests: Add a new NFS tickle test for the releasing node
Current NFS and CIFS tickle tests do not test the killtcp
functionality on the releasing node.  2-way killing is done for NFS,
so this test explicitly looks for packets from the releasing node.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2016-03-10 03:34:18 +01:00
Martin Schwenke
6b51ea35fd ctdb-tests: Allow tcptickle_sniff_wait_show() to filter by MAC address
tcpdump does not support filtering on MAC address when reading from a
file.  Therefore, this is implemented by conditionally using grep to
filter the output of tcpdump.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2016-03-10 03:34:18 +01:00
Martin Schwenke
0328098285 ctdb-tests: Re-indent and re-format some functions
This makes the next commit much easier to read.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2016-03-10 03:34:18 +01:00
Martin Schwenke
df2d752e29 ctdb-tests: Fix CIFS tickle test
There's a tiny chance that the connection information may not be
transferred to other nodes quickly enough, so add an explicit wait.
Also clean up the description and recognise that it is the takeover
node that does the tickling.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2016-03-10 03:34:18 +01:00