samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2024-12-23 17:34:34 +03:00

Author	SHA1	Message	Date
Martin Schwenke	3f37b4418e	ctdbd: Update confusing log message Inactive can also mean stopped. To add information, just print the flags instead. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a8605f7e06076e7edf84e0cc160fd3d9ab5c4b64)	2013-05-23 16:18:23 +10:00
Michael Adam	ce0916f61b	ltdb_server: use CTDB_REC_RO_FLAGS where appropriate Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-By: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 61f17e53576197def46bc61fdf0cdb5282333a3e)	2013-04-24 18:48:47 +10:00
Michael Adam	fd01c464d1	Fix a severe recovery bug that can lead to data corruption for SMB clients. Problem: Recovery can under certain circumstances lead to old record copies resurrecting: Recovery selects the newest record copy purely by RSN. At the end of the recovery, the recovery master is the dmaster for all records in all (non-persistent) databases. And the other nodes locally hold the complete copy of the databases. The bug is that the recovery process does not increment the RSN on the recovery master at the end of the recovery. Now clients acting directly on the recovery master will directly change a record's content on the recmaster without migration and hence without RSN bump. So a subsequent recovery can not tell that the recmaster's copy is newer than the copies on the other nodes, since their RSN is the same. Hence, if the recmaster is not node 0 (or more precisely not the active node with the lowest node number), the recovery will choose copies from nodes with lower number and stick to these. Here is how to reproduce: - assume we have a cluster with at least 2 nodes - ensure that the recmaster is not node 0 (maybe ensure with "onnode 0 ctdb setrecmasterrole off") say recmaster is node 1 - choose a new database name, say "test1.tdb" (make sure it is not yet attached as persistent) - choose a key name, say "key1" - all cluster nodes should be ok and no recovery running - now do the following on node 1: 1. dbwrap_tool test1.tdb store key1 uint32 1 2. dbwrap_tool test1.tdb fetch key1 uint32 ==> 1 3. ctdb recover 4. dbwrap_tool test1.tdb store key1 uint32 2 5. dbwrap_tool test1.tdb fetch key1 uint32 ==> 2 6. ctdb recover 7. dbwrap_tool test1.tdb fetch key1 uint32 ==> 1 ==> BUG This is a very severe bug, since when applied to Samba's locking.tdb database, it means that for SMB clients on clustered Samba there is the potential for locking out oneself from previously opened files or even worse, data corruption: Case 1: locking out - client on recmaster opens file - recovery propagates open file handle (entry in locking.tdb) to other nodes - client closes file - client opens the same file - recovery resurrects old copy of open file record in locking.tdb from lower node - client closes file but fails to delete entry in locking.tdb - client tries to open same file again but fails, since the old record locks it out (since the client is still connected) Case 2: data corruption - client1 on recmaster opens file - recovery propagates open file info to other nodes - client1 closes the file and disconnects - client2 opens the same file - recovery resurrects old copy of locking.tdb record, where client2 has no entry, but client1 has. - but client2 believes it still has a handle - client3 opens the file and succeeds without conflicting with client2 (the detached entry for client1 is discarded because the server does not exist any more). => both client2 and client3 believe they have exclusive access to the file and writing creates data corruption Fix: When storing a record on the dmaster, bump its RSN. The ctdb_ltdb_store_server() is the central function for storing a record to a local tdb from the ctdbd server context. So this is also the place where the RSN of the record to be stored should be incremented, when storing on the dmaster. For the case of the record migration, this is currently done in ctdb_become_dmaster() in ctdb_call.c, but there are other places such as in recovery, where we should bump the RSN, but currently don't do it. So moving the RSN incrementation into ctdb_ltdb_store_server fixes the recovery-record-resurrection bug. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-By: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit feb1d40b21a160737aead22e398f3c34ff3be8de)	2013-04-17 21:16:17 +10:00
Volker Lendecke	2d1d5d312e	Add a \n to an error message (This used to be ctdb commit 9be3b23adbfc844b71bf1d4ddf0fbc3b269f15fa)	2012-10-25 17:11:15 +11:00
Amitay Isaacs	a00e50e503	ctdbd: Replace lockwait with locking API and remove ctdb_lockwait.c Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 2126795153dacb255e441abcb36ee05107b6282a)	2012-10-20 02:48:44 +11:00
Gregor Beck	3fd0b8a5a5	ctdbd: refuse attaching with "persistent" to a non-persistent db and v.v. Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 1ebbaa620b3cfb9ff373828e4aaa84246cf3ec25)	2012-07-03 11:30:04 +02:00
Ronnie Sahlberg	59565c05cf	STATISTICS: Add tracking of the 10 hottest keys per database measured in hopcount and add mechanisms to dump it using the ctdb dbstatistics command (This used to be ctdb commit 8307c70ed98996b430c470e9641a09fdeeb81bd8)	2012-06-13 16:19:18 +10:00
Amitay Isaacs	4392591555	Remove explicit include of lib/tevent/tevent.h. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 0681014ca5ed2a9b56f63fdace7f894beccf8a9a)	2012-04-13 17:28:14 +10:00
Ronnie Sahlberg	fa3a06246a	STICKY: add prototype code to make records stick to a node to "calm" down if they are found to be very hot and accessed by a lot of clients. This can improve performance and stop clients from having to chase a rapidly migrating/bouncing record (This used to be ctdb commit d0d98f7e45e5084b81335b004d50bddc80cdc219)	2012-03-20 17:12:19 +11:00
Ronnie Sahlberg	cdc232f2dd	READONLY: don't schedule for fast vacuum deletion if any of the readonly record flags are set (This used to be ctdb commit b3307d78fd15f446b423f8cdd1e403f89fbe8ac8)	2012-02-21 06:54:09 +11:00
Ronnie Sahlberg	61762a96e3	ReadOnly: Make sure we don't try to fast-vacuum records that are set for readonly delegation (This used to be ctdb commit 303134cf10a08ce61954d5de9025d9bbcb5f75ef)	2012-02-20 21:13:46 +11:00
Ronnie Sahlberg	73f8be16c6	ReadOnly: add per-database statistics to view how many delegations/revokes we have (This used to be ctdb commit 751ed46197661eb841042ab6a02855a51dd0b17c)	2012-02-08 15:29:27 +11:00
Michael Adam	6f0b22234f	ctdb_ltdb_store_server: when storing a record that is not to be scheduled for deletion, remove it from the delete queue Pair-Programmed-With: Stefan Metzmacher <metze@samba.org> (This used to be ctdb commit 489148e465e2b8aed87ea836e3518f43490671ca)	2011-12-23 17:39:01 +01:00
Martin Schwenke	f186dd90b6	Move some common functions to common/ctdb_ltdb.c Move identical copies of ctdb_null_func(), ctdb_fetch_func(), ctdb_fetch_with_header_func() from ctdb_client.c and ctdb_ltdb_server.c to somewhere common. This is in the context of wanting to run CCAN-style tests where most of the ctdbd code is just included in the test program. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 126cb0d369b2b1aed63801dc4ba0554399e8b7e4)	2011-11-11 14:31:50 +11:00
Martin Schwenke	3b47e5fa49	Fix typo in ctdb_ltdb_store_server() The if statement uses ret but means to use ret2. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f40101a615f8b9826a484e4697bfea6ee2b9ba88)	2011-11-09 14:55:07 +11:00
Ronnie Sahlberg	0e79b2d1e8	Record Fetch Collapse: Collapse multiple fetch requests into one single request. When multiple clients fetch the same record concurrently, send only one single fetch across the network and defer all other fetches locally. This improves performance for hot records and reduces cpu load on ctdb. (This used to be ctdb commit 82d6946ad8b3348e8b9d3d971f24925ade02d1be)	2011-11-08 16:08:28 +11:00
Ronnie Sahlberg	90c0c235fa	Remove debug message (This used to be ctdb commit db0fdc2281c4742113c92d697371c37815db35a0)	2011-10-24 12:21:55 +11:00
Ronnie Sahlberg	0dc5584101	Merge branch 'master-readonly-records' into foo Conflicts: Makefile.in tools/ctdb.c (This used to be ctdb commit 0fedef0ffba4178126eee9544c5e2db52f5db893)	2011-09-12 09:34:34 +10:00
Michael Adam	a3e0079568	Add a tunable "AllowClientDBAttach" with default value 1. When set to 0, clients will not be able to attach to databases via the db_attach control. This can be useful for maintenance where ctdb should be kept running but clients should not be able to modify databases. (This used to be ctdb commit ddfeecda87955b4e46777599f678e6926d37f4c4)	2011-09-05 16:17:39 +10:00
Ronnie Sahlberg	206a3c0c66	ReadOnly: add a new control to activate readonly lock capability for a database. let all databases default to not support this until enabled through this control (This used to be ctdb commit 908a07c42e5135a3ba30a625fc4f4e4916de197a)	2011-09-01 11:08:18 +10:00
Ronnie Sahlberg	9729d3e339	ReadOnly: Check the readonly flag instead of whether the tdb pointer is NULL or not (This used to be ctdb commit 01314c2cb3a480917d6a632b83c39f0a48bba0e7)	2011-08-23 10:41:52 +10:00
Ronnie Sahlberg	1441b77cce	ReadOnly: Add "readonly" flag to the ctdb_db_context to indicate if this database supports readonly operations or not. Add a private lock-less tdb file to the ctdb_db_context to use for tracking delegations for records. Assume all databases will support readonly mode for now and set the flag for all databases. At a later stage we will add support to control on a per database level whether delegations will be supported or not. (This used to be ctdb commit 502f86f79944df4bac9094f716e54110c511dc24)	2011-08-23 10:24:26 +10:00
Ronnie Sahlberg	00a870f759	ReadOnly records: Add a new RPC function FETCH_WITH_HEADER. This function differs from the old FETCH in that this function will also fetch the record header and not just the record data (This used to be ctdb commit c7196d16e8e03bb2a64be164d15a7502300eae0e)	2011-08-23 10:06:59 +10:00
Ronnie Sahlberg	cee8c4be94	Deferred attach: create the timed event as a child context of the da context we want to delete. Otherwise the da context can be timed out and talloc_free()d but the event for this already freed object will still trigger, causing a talloc error and shutdown. CQ S1022515 (This used to be ctdb commit 2fd27bdedb1e0d6558c07e1b74fc8e70ddf593dc)	2011-03-16 16:08:45 +11:00
Ronnie Sahlberg	3cc230b5ee	Don't allow clients to connect to databases until we are well past and through the initial recovery phase CQ S1022412 Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit e02bbd915b7151c615ff64f09ad9abc9720bef7d)	2011-03-14 13:35:53 +01:00
Michael Adam	e77ed68c1a	ctdb_ltdb_store_server: honour the AUTOMATIC record flag Do not delete empty records that carry this flag but store them and schedule them for deletion. Do not store the flag in the ltdb though, since this is internal only and should not be visible to the client. (This used to be ctdb commit f898ff21fa338358179e79381215b13a6bc77c53)	2011-03-14 13:35:51 +01:00
Michael Adam	6506314c4a	ctdb_ltdb_store_server: add ability to send SCHEDULE_FOR_DELETION control to ctdb_ltdb_store. (This used to be ctdb commit ab2711701999a5ecc23a36b3d9ba8e94f92e4c87)	2011-03-14 13:35:51 +01:00
Michael Adam	1924d0d365	ctdb_ltdb_store_server: Improve debug message in ctdb_ltdb_store when store or delete fails. (This used to be ctdb commit 2559b2a45eb11834da3b0e0963e24351c8b7477f)	2011-03-14 13:35:51 +01:00
Michael Adam	7088e2144f	ctdb_ltdb_store_server: always store the data when ctdb_ltdb_store() is called from the client This also fixes a segfault since ctdb_lmaster uses the vnn_map. (This used to be ctdb commit e58c8f51f27e468897af5210b80e5f5f45c3c4bb)	2011-03-14 13:35:51 +01:00
Michael Adam	6384512eb7	ctdb_ltdb_store_server: implement fastpath vacuuming deletion based on VACUUM_MIGRATED flag. When the record has been obtained by the lmaster as part of the vacuuming-fetch handler and it is empty and has never been migrated with data, then such records are deleted instead of being stored. These records have automatically been deleted when leaving the former dmaster, so that they vanish for good when hitting the lmaster in this way. This reduces the load on traditional vacuuming. Pair-Programmed-With: Stefan Metzmacher <metze@samba.org> (This used to be ctdb commit c9b65f3602f51bcbf0e6d82c12076c31e4aebe38)	2011-03-14 13:35:51 +01:00
Michael Adam	7602f9a9af	ctdb_ltdb_store_server: delete an empty record that is safe to delete instead of storing locally. When storing a record that is being migrated off to another node and has never been migrated with data, then we can safely delete it from the local tdb instead of storing the record with empty data. Note: This record is not deleted if we are its lmaster or dmaster. Pair-Programmed-With: Stefan Metzmacher <metze@samba.org> (This used to be ctdb commit 3cca0d4b48325d86de2cb0b44bb7811a30701352)	2011-03-14 13:35:50 +01:00
Michael Adam	9e8d6b82b5	server: Use the ctdb_ltdb_store_server() in the ctdb daemon for non-persistent dbs This is realized by adding a ctdb_ltdb_store_fn function pointer to the db context and filling it in the attach procedure for non-persistent dbs. (This used to be ctdb commit df49ec44de80affa5ccc637dec12a20a26e8706e)	2011-03-14 13:35:50 +01:00
Michael Adam	7948be380c	server: create a server variant ctdb_ltdb_store_server() of ctdb_ltdb_store(). This is supposed to contain logic for deleting records that are safe to delete and scheduling records for deletion. It will be called in server context for non-persistent databases instead of the standard ctdb_ltdb_store() function. (This used to be ctdb commit 23631ffc152486aed9ce5b69a391e52bc4947833)	2011-03-14 13:35:50 +01:00
Michael Adam	b9c9b989ce	When attaching to a non-persistent DB, initialize the delete_queue. (This used to be ctdb commit 0aff1b61dd1b683c6739478008a5b014b933df50)	2011-03-14 13:35:45 +01:00
Ronnie Sahlberg	b611de93ad	ATTACH_DB: simplify the code slightly and change the semantics to only refuse a db attach during recovery IF we can associate the request from a genuine real client instead of deciding this on whether client_id is zero or not. This will suppress/avoid messages like these: DB Attach to database %s refused. Can not match clientid... (This used to be ctdb commit b05ccf366df985e0a3365aacc75761ebd438deaf)	2011-03-01 12:13:46 +11:00
Ronnie Sahlberg	8acb677c9c	Deferred attach : at early startup, defer any db attach calls until we are out of recovery. (This used to be ctdb commit eeaabd579841f60ab2c5b004cbbb1f5de2bfe685)	2011-03-01 12:13:34 +11:00
Ronnie Sahlberg	e00ca55fa4	Don't return error if trying to set db priority on a db that does not yet exist. Just treat as a nop. When the database is created later it will get its priority set properly. (This used to be ctdb commit 05c934b10ad2690be9d75c9033a0b849bf16455d)	2011-02-25 10:25:01 +11:00
Ronnie Sahlberg	8eb5cfc053	Add support to create TDB databases using the new jenkins hash. SRVID for the control to attach to a database is used to pass tdb flags from samba to ctdb when samba attached to a database. This has been used earlier for TDB_NOSYNC flag. Add TDB_INCOMPATIBLE_HASH as a supported tdb flag to store in the SRVID field when attaching to a database. This allows samba to control if ctdb should create databases using the new jenkins hash, or using the old hash. This only affects new databases when they are initially created. Existing databases remain using the old hash when attached to. (This used to be ctdb commit e0eda175ac979828b376e8a6779b4608af52eb32)	2010-10-27 08:10:07 +11:00
Rusty Russell	f93440c4b7	event: Update events to latest Samba version 0.9.8 In Samba this is now called "tevent", and while we use the backwards compatibility wrappers they don't offer EVENT_FD_AUTOCLOSE: that is now a separate tevent_fd_set_auto_close() function. This is based on Samba version `7f29f817fa`. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 85e5e760cc91eb3157d3a88996ce474491646726)	2010-08-18 09:16:31 +09:30
Ronnie Sahlberg	641da4c691	We can not be holding a chainlock at this stage, so the tdb_chainunlock() call is bogus (a child process might be holding the lock, but not the main daemon) (This used to be ctdb commit 9b4a83e49c5df80df8498b7384c5f53f390c1d9d)	2010-06-09 15:13:22 +10:00
Ronnie Sahlberg	fa618aa66a	add additional logging when tdb_chainunlock() fails so we can see where it was called from when it fails (This used to be ctdb commit 0c091b3db6bdefd371787d87bc749593ea8e3c76)	2010-06-09 14:37:16 +10:00
Ronnie Sahlberg	4c722fe34c	fix a conflict in the merge from rusty Merge commit 'rusty/ctdb-no-setsched' Conflicts: server/ctdb_vacuum.c (This used to be ctdb commit b4365045797f520a7914afdb69ebd1a8dacfa0d9)	2009-12-17 08:18:04 +11:00
Rusty Russell	af2613e16f	ctdb: use mlockall, cautiously We don't want ctdb stalling due to paging; this can be far worse than scheduling delays. But if we simply do mlockall(MCL_FUTURE), it increases the risk that mmap (ie. tdb open) or malloc will fail, causing us to abort. This patch is a compromise: we mlock all current pages (including 10k of future stack for expansion) and then relock when a client asks us to open a TDB. We warn, but don't exit, if it fails. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 82f778e85440bc713d3f87c08ddc955d3cfce926)	2009-12-16 20:57:20 +10:30
Rusty Russell	f148735928	Add --valgrinding flag instead of --nosetsched The do_setsched was being tested for whether to mmap tdbs: let's make it explicit. We can also happily move the kill-child eventscript hack under this flag. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 2ee86cc1f311d7b7504c7b14d142b9c4f6f4b469)	2009-12-16 20:59:15 +10:30
Stefan Metzmacher	f1f0af2b67	server: add CTDB_CONTROL_DB_SET_HEALTHY and CTDB_CONTROL_DB_GET_HEALTH metze (This used to be ctdb commit 7332d900538f0cbcd953a723417a0fe31dc9807c)	2009-12-16 08:08:29 +01:00
Stefan Metzmacher	94bc40307a	server: Use tdb_check to verify persistent tdbs on startup Depending on --max-persistent-check-errors we allow ctdb to start with unhealthy persistent databases. The default is 0 which means to reject a startup with unhealthy dbs. The health of the persistent databases is checked after each recovery. Node monitoring and the "startup" is deferred until all persistent databases are healthy. Databases can become healthy automatically by a completely HEALTHY node joining the cluster. Or by an administrator with "ctdb backupdb/restoredb" or "ctdb wipedb". metze (This used to be ctdb commit 15f133d5150ed1badb4fef7d644f10cd08a25cb5)	2009-12-16 08:06:10 +01:00
Stefan Metzmacher	b74918b465	server: open /var/ctdb/state/persistent_health.tdb.X on startup This node internal tdb will store the HEALTH state of persistent tdbs. metze (This used to be ctdb commit cbda4666be88c11a810a192a70667b57f773ace1)	2009-12-16 08:03:56 +01:00
Stefan Metzmacher	9a96ae0c97	server: only do the mkdir() calls for db_directory* once at the start metze (This used to be ctdb commit f30f33685db50860b6cd6fd1b6bdc3066620a78f)	2009-12-16 08:03:56 +01:00
Stefan Metzmacher	cda5884854	server: create tdbs with 0600 permissions in ctdb_local_attach() metze (This used to be ctdb commit 6529a1328b9ec304ad306674651b2a67e4426e23)	2009-12-16 08:03:55 +01:00
Stefan Metzmacher	003985acfd	ctdb: pass TDB_DISALLOW_NESTING to all tdb_open/tdb_wrap_open calls metze Signed-off-by: Stefan Metzmacher <metze@samba.org> (This used to be ctdb commit 1635e931b909c66eb3b1f5357e3a549b1a0da70d)	2009-12-16 08:03:55 +01:00
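
The recovery bug described in commit fd01c464d1 can be illustrated with a small model. This is only a sketch under simplifying assumptions, not the ctdb implementation: the struct and function names (`rec_copy`, `recover`, `store_on_dmaster`, `replay_reproduction`) are hypothetical, a record is reduced to a single integer, and recovery is modelled as "highest RSN wins, and on a tie the lowest-numbered node wins", as the commit message describes. It replays the reproduction steps with node 1 as recmaster, once without and once with the RSN bump on store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one node's copy of a record. */
struct rec_copy {
    int      present; /* non-zero if this node holds a copy */
    uint64_t rsn;     /* record sequence number */
    uint32_t value;   /* stand-in for the record data */
};

/* Model of recovery: choose the present copy with the highest RSN.
 * On equal RSNs the copy from the lowest-numbered node is kept.
 * Afterwards every node holds the winning copy. */
static void recover(struct rec_copy *nodes, size_t n)
{
    size_t best = n; /* n means "no copy found yet" */

    assert(nodes != NULL && n > 0);
    for (size_t i = 0; i < n; i++) {
        if (!nodes[i].present) {
            continue;
        }
        if (best == n || nodes[i].rsn > nodes[best].rsn) {
            best = i;
        }
    }
    if (best == n) {
        return; /* nobody has the record */
    }
    struct rec_copy winner = nodes[best];
    for (size_t i = 0; i < n; i++) {
        nodes[i] = winner;
    }
}

/* Store directly on the dmaster. bump_rsn == 0 models the behaviour
 * before the fix, bump_rsn != 0 the behaviour after it. */
static void store_on_dmaster(struct rec_copy *rec, uint32_t value, int bump_rsn)
{
    rec->present = 1;
    rec->value = value;
    if (bump_rsn) {
        rec->rsn++;
    }
}

/* Replays the steps from the commit message on a two-node cluster
 * where node 1 is the recmaster. Returns the value node 1 sees at
 * the final fetch. */
uint32_t replay_reproduction(int bump_rsn)
{
    struct rec_copy nodes[2] = { { 0, 0, 0 }, { 0, 0, 0 } };

    store_on_dmaster(&nodes[1], 1, bump_rsn); /* store key1 = 1 */
    recover(nodes, 2);                        /* ctdb recover   */
    store_on_dmaster(&nodes[1], 2, bump_rsn); /* store key1 = 2 */
    recover(nodes, 2);                        /* ctdb recover   */
    return nodes[1].value;                    /* final fetch    */
}
```

In this model `replay_reproduction(0)` yields the stale value 1, because the second recovery sees equal RSNs and keeps node 0's copy, while `replay_reproduction(1)` yields 2.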


80 Commits
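
The rule from commit 7602f9a9af for deleting an empty record instead of storing it can be written as a single predicate. The following is a hedged sketch, not the code in ctdb_ltdb_store_server(): the struct `sketch_record`, the flag `SKETCH_FLAG_MIGRATED_WITH_DATA`, and the function `safe_to_delete_locally` are names invented for illustration, and the lmaster is passed in as a plain argument rather than computed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag: the record has at some point been migrated
 * while carrying data. */
#define SKETCH_FLAG_MIGRATED_WITH_DATA 0x1u

/* Hypothetical, reduced record header plus payload length. */
struct sketch_record {
    uint32_t flags;
    size_t   data_len; /* 0 means the record is empty */
    uint32_t dmaster;  /* node that will own the record after the store */
};

/* Returns true if, per the commit message, the record can be deleted
 * from the local tdb instead of being stored with empty data:
 * it is empty, has never been migrated with data, and this node is
 * neither its lmaster nor its dmaster. */
bool safe_to_delete_locally(const struct sketch_record *rec,
                            uint32_t this_node, uint32_t lmaster)
{
    assert(rec != NULL);

    if (rec->data_len != 0) {
        return false; /* non-empty records are always stored */
    }
    if (rec->flags & SKETCH_FLAG_MIGRATED_WITH_DATA) {
        return false; /* other nodes may still hold old data */
    }
    if (lmaster == this_node || rec->dmaster == this_node) {
        return false; /* lmaster and dmaster keep their copy */
    }
    return true;
}
```

For example, on node 1 an empty, never-migrated record with dmaster 2 and lmaster 0 would be deleted, whereas the same record is kept if node 1 is its lmaster.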