1
0
mirror of https://github.com/samba-team/samba.git synced 2025-01-15 23:24:37 +03:00

811 Commits

Author SHA1 Message Date
Ronnie Sahlberg
26322d257d DEBUG: Add checks for and print debug messages when 1) a database contains very many records, 2) when a database is very big, 3) when a single record is very big.
Add tunables to control when to log these instances and allow it to be completely turned off by setting the threshold to 0

(This used to be ctdb commit 9ed58fef4991725f75509433496f4d5ffae0ae87)
2012-05-21 13:26:13 +10:00
Ronnie Sahlberg
dce5969d12 Debug: When scripts hang, we may need to collect additional data in order to debug why the script hung.
Break this debug and datacollection out into an external script to make it easier to modify what data we need to collect.
For now we only collect a pstree so we can see what part of the script we hung in.

S1037271

(This used to be ctdb commit 6e68797af67bee36f2bad045f94806e7e98f27e9)
2012-05-17 10:29:03 +10:00
Ronnie Sahlberg
a57eba2bb4 Track all child process so we never send a signal to an unrelated process (our child died and kernel wrapped the pid-space and reused the pid for a different process
Wrap all creation of child processes inside ctdb_fork() which is used to track all processes we have spawned.
Capture SIGCHLD to track also which child processes have terminated.

Wrap kill() inside ctdb_kill() and make sure that we never send a !0 signal to a child process pid that has already terminated (and might have been replaced with a

(This used to be ctdb commit f73a4b1495830bcdd094a93732a89dd53b3c2f78)
2012-05-03 14:03:26 +10:00
Ronnie Sahlberg
a367fa6138 RELOADIPS: simplify the reloadips code a bit
and also update the "read public address file" to not check if the address exists already locally when we read if from the child process, to stop it
from spamming the logs with "We already host ..."
messages

(This used to be ctdb commit 334ea830f1bf33419f4a1e78f23afd41a852d0f4)
2012-05-01 15:34:26 +10:00
Ronnie Sahlberg
7a1aa560e7 Add new control to reload the public ip address file on a node
Also add a method to use the recovery master/daemon to reload the public ips on all nodes in the cluster.
Reloading the public ips on all node sin the cluster is only suported if all nodes in the cluster are available and healthy.

(This used to be ctdb commit 05603e914f8c12618d7e06943c0f7df207f645b0)
2012-05-01 10:48:08 +10:00
Amitay Isaacs
131d35d67d includes: Move special tevent defines from tevent.h to includes.h
This allows to build against system tevent library. Also include tevent header
along with other common headers.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 9ae4389c2c959c5dcd8395fdae2b25ed7e1e873a)
2012-04-13 17:28:14 +10:00
Martin Schwenke
fbe64dec01 Undo damage done by d8d37493478a26c5f1809a5f3df89ffd6e149281
The implementation of DisableIPFailover got intermingled with
--nopublicipcheck.  This just looks wrong - Ronnie must have been
having a bad day.  :-)

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 5083b266dd68b292c4275505f3d1b878dbf12f11)
2012-03-22 15:34:52 +11:00
Ronnie Sahlberg
2456f77ca6 NoIPTakeover: change the tunable name for the "dont allow failing addresses over onto the node" to NoIPTakeover
(This used to be ctdb commit 35592e618cfd827b6978af6332f80504f232c46a)
2012-03-22 11:05:15 +11:00
Ronnie Sahlberg
befa9df152 Make NoIPFailback a node local setting. Nodes that have NoIPFailback set to !0 can not takeover new ip addresses during failover.
Remove the old global setting for this unused tunable and add it as a new node flag. This node flag is only valid/defined within the takeover subsystem in the recovery daemon. Add async functions to collec the NoIPFailback settings for each node.

This will later e used to disqualify certain nodes from being takeover targets when we perform reallocation.

(This used to be ctdb commit 668f3e88a9e5f598706952b7140547640c85a5ed)
2012-03-22 09:09:57 +11:00
Ronnie Sahlberg
fa3a06246a STICKY: add prototype code to make records stick to a node to "calm" down if they are found to be very hot and accessed by a lot of clients.
This can improve performance and stop clients from having to chase a rapidly migrating/bouncing record

(This used to be ctdb commit d0d98f7e45e5084b81335b004d50bddc80cdc219)
2012-03-20 17:12:19 +11:00
Ronnie Sahlberg
e7e51ddb64 LACOUNT: Add back lacount mechanism to defer migrating a fetched/read copy until after default of 20 consecutive requests from the same node
This can improve performance slightly on certain workloads where smbds frequently read from the same record

(This used to be ctdb commit 035c0d981bde8c0eee8b3f24ba8e2dc817e5b504)
2012-03-20 12:26:22 +11:00
Ronnie Sahlberg
6a493a0b08 STATISTICS: add per-db hop count statistics
(This used to be ctdb commit 1c976d83b1d7dac6f0ef81306774998e4c8b56a1)
2012-03-20 12:11:55 +11:00
Ronnie Sahlberg
c051f67d67 FETCH COLLAPSE : Change the fetch-lock collapse to collapse ALL fetches, including fetch-locks into a single command in flight per record. Also add a tunable to enable/disable this optimization for hot records
(This used to be ctdb commit eafd7bbaaa5931546a96c8beae3cf9a39a49c925)
2012-03-20 11:39:00 +11:00
Ronnie Sahlberg
038c946e80 add max hop count buckets to see how bad hopcounts are
(This used to be ctdb commit 7d3931298e6477d92f43652c3006b0c426cb1307)
2012-03-20 11:20:53 +11:00
Ronnie Sahlberg
f3600276fc Add a tunable variable to control how long we defer after a ctdb addip until we force a rebalance and try to failback addresses onto this node
Have it default to 300 seconds.

(This used to be ctdb commit 49791db7dc74cffd7e88bd73091590cdc1909328)
2012-02-28 06:58:59 +11:00
Ronnie Sahlberg
ef2bd0b016 When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance.
(This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2)
2012-02-28 06:56:04 +11:00
Ronnie Sahlberg
93ec9c589c Eventscripts: remove the horrible horrible circular reference between state and callback since these two structures do not even share the same parent talloc context.
Instead, tie them together via referencing a permanent linked list hung off the ctdb structure.

(This used to be ctdb commit a95c02da6c67dc4bd8716b75318a4188301df6f9)
2012-02-23 06:49:47 +11:00
Ronnie Sahlberg
42e477b14e READONLY: only send a control to schedule fast-vacuuming from child context iff we have a connection open to the main daemon
there are some child processes where we do not create a connection to the main daemon (switch_from_server_to_client()) because it is expensive to set up and we normally might not need to talk to the daemon at all via a domainsocket.
but we might want to still call to ctdb_ltdb_store() from such chil processes.

(This used to be ctdb commit 9e372a08c40087e6b5335aa298e94d88273566a5)
2012-02-21 07:03:44 +11:00
Ronnie Sahlberg
73f8be16c6 ReadOnly: add per-database statistics to view how much delegations/revokes we have
(This used to be ctdb commit 751ed46197661eb841042ab6a02855a51dd0b17c)
2012-02-08 15:29:27 +11:00
Ronnie Sahlberg
1eafa68f0f STATISTICS: add total counts for number of delegations and number of revokes
Everytime we give a delegation to another node we count this as one delegation.
If the same record is delegated to several nodes we count one for each node.

Everytime a record has all its delegations revoked we count this as one revoke.

(This used to be ctdb commit b098bcf8007be63889aaed640a951b0eeaa9d191)
2012-02-08 13:42:30 +11:00
Martin Schwenke
ed8a8ee966 libctdb - add ctdb_getvnnmap()
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit f6039eaece4224b866a98dd49010f278a7b3f015)
2012-02-06 16:00:23 +11:00
Ronnie Sahlberg
e648045499 Merge branch 'master' of ssh://git.samba.org/data/git/ctdb
(This used to be ctdb commit 15d8ae8b0f80f95d7839528b8ac60aa0e2485c77)
2012-01-03 12:40:15 +11:00
Michael Adam
e04fad0ee4 vacuum: add new tunable VacuumInterval and mark Vacuum{Default,Min,Max}Interval obsolete
And use VacuumInterval instead of VacuumDefaultInterval in the code.

(This used to be ctdb commit 78530f40338f511a7cd1d33ada450905742bfa8f)
2011-12-23 17:39:02 +01:00
Michael Adam
a481ca711f vacuum: add ctdb_local_remove_from_delete_queue()
Pair-Programmed-With: Stefan Metzmacher <metze@samba.org>

(This used to be ctdb commit a5065b42a98c709173503e02d217f97792878625)
2011-12-23 17:39:00 +01:00
Martin Schwenke
8b74037633 ctdb tool - generalise nodestring parsing for -n
Centralise -n nodestring parsing and add the ability to pass a
comma-separated list of node numbers.  Listing a node that is
disconnected or deleted results in failure, similar to the way passing
a single node currently works.  All of the auto_all commands inherit
this functionality.  For now, the non-auto_all commands do not inherit
this - they need to be individually tweaked.  Therefore, we haven't
updated the documentation to advertise this feature.

Implemented via a new function parse_nodestring() that parses an
optional (pass NULL when not available to indicate "current node")
comma-separated list of node numbers or "all".  parse_nodestring() can
be told to be non-fatal for disconnected/deleted nodes so it can also
be used in other contexts (yes, coming soon).  main() is changed to
call this function.

A new magic PNN value CTDB_MULTICAST is added and along with a
corresponding option.nodes structure member (a talloc-ed array of
PNNs).  This is also populated for "all" as well.

control_status() has new function pretty_print_flags() factored out so
pretty-printed flags can be used in error/debug messages.  New
function is_partially_online() is also factored out - this simplifies
some of the logic.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 920e3a732eb9e09004edde6cfb3c7db8a004016f)
2011-12-08 17:00:17 +11:00
Ronnie Sahlberg
609149bdc8 LibCTDB: Add support for the 'get interfaces' control and update the ctdb tool to use this interface
(This used to be ctdb commit 77dc0c7351071243d9096d3607d7499c82f46ec0)
2011-12-06 13:12:18 +11:00
Mathieu Parent
bb3d6698e9 Move platform-specific code to common/system_*
This removes #ifdef AIX and ease the addition of new platforms.

(This used to be ctdb commit 2fd1067a075fe0e4b2a36d4ea18af139d03f17bf)
2011-12-06 11:57:11 +11:00
Michael Adam
ad0de5494e traverse: fix traversing with empty records by adding a new (internal) control CTDB_CONTROL_TRAVERSE_START_EXT
By this, the original CTDB_CONTROL_TRAVERSE_START control that is
used by e.g. samba's smbstatus, is not changed, so that samba
continues working without code change.

The  CTDB_CONTROL_TRAVERSE_START currently just adds the "withemptyrecords"
flag to the state and processon on as CTDB_CONTROL_TRAVERSE_START_EXT.

(This used to be ctdb commit 8281bb210858ed04992eacea7f6d02261e0fc1b1)
2011-12-03 02:15:30 +01:00
Ronnie Sahlberg
11f3c947e6 LibCTDB: add support for the check-srvids control
(This used to be ctdb commit c32604fd0016de0df14845a2f222edaa3c52a4fa)
2011-11-30 10:00:07 +11:00
Volker Lendecke
5a1da0ac55 Add CTDB_CONTROL_CHECK_SRVID
(This used to be ctdb commit ad64ef2c40a2a12b37dbf39142e95c6781c2fc3b)
2011-11-30 09:02:26 +11:00
Ronnie Sahlberg
0420449a6c Recover Persistent database DB by DB and not record by record
Add a new tunable that changes the mode how persistent databases are recovered.
RecoveryPDBBySeqNum

When set to 1, persistent databases will be recovered in whole from the node which
has the highest "__db_sequence_number__" record.
This record is managed by samba for those databases where we do persistent writes and have
inter-record relations.
For these databases we do not want the usual "blend records from all nodes based
on individual record RSN" but instead a mode where we pick one instance of the persistent database.

If no node was found with a "__db_sequence_number__" record at all, we fail back to the original "recover records independently based on record RSN".
Some persistent databases do not contain record interrelations and as such does not
contain this special record at all.

(This used to be ctdb commit 502150c764298a9fa8c4d8aa445bf7d85d4ee9dc)
2011-11-30 08:48:23 +11:00
Ronnie Sahlberg
3cbff2edd8 LibCTDB: add get persistent db seqnum control
(This used to be ctdb commit 6e96a62494bbb2c7b0682ebf0c2115dd2f44f7af)
2011-11-30 08:48:14 +11:00
Michael Adam
31d62794fe ctdb: add an option --print-recordflags to trigger printing record flags in catdb and dumpdbbackup
This changes the default behaviour to not print record flags.

(This used to be ctdb commit 2d2ce07c51055d9400b22cd3c1fd682597cb921c)
2011-11-29 13:43:35 +01:00
Michael Adam
e6923904e8 ctdb: add an option --print-hash to enable printing of record hashes when dumping dbs
(This used to be ctdb commit efc033c28ade97f9884794256d59a4553e052d5f)
2011-11-29 13:43:34 +01:00
Michael Adam
86cd78efee ctdb: add an option --print-lmaster to enable printing of lmaster in "ctdb catdb"
(This used to be ctdb commit 326f88ef622620cb9e0569c4497bc0e86124beaa)
2011-11-29 13:43:33 +01:00
Michael Adam
dc98c12ac9 ctdb: add an option --print-datasize to only print datasize instead of dumping data in db dumps
Used in catdb, cattdb and dumpdbbackup.

(This used to be ctdb commit dd866116041e71cbf91e7fd91edcc9501634051d)
2011-11-29 13:43:32 +01:00
Michael Adam
1fcc7651f4 ctdb: add an option --print-emptyrecords to enable printing of empty records in dumping databases
this option is used with the commands catdb, cattdb and dumpdbbackup.

(This used to be ctdb commit 6ec68a2e667f66d2b194fe48cb75229a2777842e)
2011-11-29 10:30:24 +01:00
Michael Adam
1a31c84348 traverse: add a flag to enable transferring empty records in cluster wide traverse
This will be useful for also printing information about empty/deleted
records in "ctdb catdb", e.g. for debugging vacuuming issues.

(This used to be ctdb commit ddc5da3a0df7701934404192a0a0aa659a806acb)
2011-11-29 10:30:24 +01:00
Martin Schwenke
3ae8273d86 Make some ctdb_takeover.c functions static
These were intentionally not static so they could be linked to in unit
test programs.  However, using the CCAN-style unit tests where
relevant code is just included, this is no longer necessary.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d0e9e8554614bd49ffb9ec3509feaa0e80d0f65d)
2011-11-11 14:41:47 +11:00
Martin Schwenke
f186dd90b6 Move some common functions to common/ctdb_ltdb.c
Move identical copies of ctdb_null_func(), ctdb_fetch_func(),
ctdb_fetch_with_header_func() from ctdb_client.c and
ctdb_ltdb_server.c to somewhere common.

This is in the context of wanting to run CCAN-style tests where most
of the ctdbd code is just included in the test program.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 126cb0d369b2b1aed63801dc4ba0554399e8b7e4)
2011-11-11 14:31:50 +11:00
Martin Schwenke
52ff485958 Added some #ifndefs to stop files being included multiple times.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit fdca12c25e6fce6206135b994dedf44265e4eb09)
2011-11-11 14:31:50 +11:00
Ronnie Sahlberg
44de394796 SRVID ranges: Change the ranges for SRVIDs to allow 8 bit prefixes
Update the ranges used for SRVID allocation to allow 8 bit prefixes and thus
56 user-defined bits.
Define the defacto-use of the 0x00 prefix as a SRVID used to register a process id
Upgrade SAMBA/iSCSI/NFS/TEST from a 32 bit prefix each ot a 8 bit prefix each
for private use.

(This used to be ctdb commit 5de9ec2bdf8067406165bc470becdca87f458ae9)
2011-11-09 08:12:44 +11:00
Ronnie Sahlberg
0e79b2d1e8 Record Fetch Collapse: Collapse multiple fetch request into one single request.
When multiple clients fetch the same record concurrently, send only one single
fetch across the network and deferr all other fetches locally.
This improves performance for hot records and reduces cpu load on ctdb.

(This used to be ctdb commit 82d6946ad8b3348e8b9d3d971f24925ade02d1be)
2011-11-08 16:08:28 +11:00
Ronnie Sahlberg
c21ec9fffc ReadOnly: add readonly record lock requests to libctdb
Initial readonly record support in libctdb.
New records are not yet created by the library but extising records will be delegated as readonly records.
This needs a bit more tests before we can drop the "old style" implementation of client
code in client/ctdb_client.c

(This used to be ctdb commit fb50a45a21ff56480d76acd1c33c13c323cbf5e2)
2011-10-28 11:55:46 +11:00
Ronnie Sahlberg
8e4bfba75c ReadOnly: Rename the function ctdb_ltdb_fetch_readonly() to ctdb_ltdb_fetch_with_header() since this is what it actually does.
(This used to be ctdb commit 94a5ce4e08e7891f07dbfe4c822ca4be5ab10965)
2011-09-13 18:38:20 +10:00
Ronnie Sahlberg
0dc5584101 Merge branch 'master-readonly-records' into foo
Conflicts:

	Makefile.in
	tools/ctdb.c

(This used to be ctdb commit 0fedef0ffba4178126eee9544c5e2db52f5db893)
2011-09-12 09:34:34 +10:00
Ronnie Sahlberg
1c05db2c9c Merge remote branch 'ddiss/master_pmda_and_client_timeouts'
(This used to be ctdb commit 7bebfc7bad8f36e54003b8e25372fdaf54836e21)
2011-09-08 11:22:53 +10:00
David Disseldorp
2f925f1e64 pmda: Attempt reconnects while ctdbd is unavailable
Attempt to reconnect to ctdbd on fetch while it is unreachable.

We must provide our own queue callback wrapper, as ctdb_client_read_cb()
exits on transport failure.

(This used to be ctdb commit 28df6fbf1273b8d095a2bc38dca6a6c35c5c31bd)
2011-09-06 14:01:18 +02:00
David Disseldorp
5296da5609 client: add timeout argument to ctdb_attach
Rather than using a fixed 2 second CTDB_CONTROL_GETDBPATH timeout.

(This used to be ctdb commit 9e178671560cb95121e11d718a76b05380ecd6c5)
2011-09-06 13:57:04 +02:00
David Disseldorp
0628d1c0e6 client: add req timeout argument to ctdb_cmdline_client
Following connection to the local ctdbd, ctdb_cmdline_client() currently
issues a CTDB_CONTROL_GET_PNN request with a fixed 3 second timeout.

The ctdb cmd line client accepts a --timelimit argument for specifying
a per request timeout, pass this value through to ctdb_cmdline_client()
for use as a CTDB_CONTROL_GET_PNN request timeout.

(This used to be ctdb commit 0634d0305f42f17048b6830733767e8dc300e11c)
2011-09-06 13:56:54 +02:00