samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2024-12-23 17:34:34 +03:00

Author	SHA1	Message	Date
Michael Adam	56f9231c8e	tdb: use tdb_freelist_merge_adjacent in tdb_freelist_size() So that we automatically defragment the free list when freelist_size is called (unless the database is read only). Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Volker Lendecke <vl@samba.org>	2014-06-26 10:00:11 +02:00
Michael Adam	843a8a5c7b	tdb: add tdb_freelist_merge_adjacent() This is intended to be called to reduce the fragmentation in the freelist. This is to make up the deficiency of the freelist to be not doubly linked. If the freelist were doubly linked, we could easily avoid the creation of adjacent freelist entries. But with the current singly linked list, it is only possible to cheaply merge a new free record into a freelist entry on the left, not on the right... This can be called periodically, e.g. in the vacuuming process of a ctdb cluster. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Volker Lendecke <vl@samba.org>	2014-06-26 10:00:11 +02:00
Michael Adam	73c439f581	tdb: add utility function check_merge_ptr_with_left_record() Variant of check_merge_with_left_record() that reads the record itself if necessary. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Volker Lendecke <vl@samba.org>	2014-06-26 10:00:11 +02:00
Michael Adam	4bec28bfa9	tdb: simplify tdb_free() using check_merge_with_left_record() Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Volker Lendecke <vl@samba.org>	2014-06-26 10:00:11 +02:00
Michael Adam	117807cd2d	tdb: add utility function check_merge_with_left_record() Check whether the record left of a given freelist record is also a freelist record, and if so, merge the two records. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Volker Lendecke <vl@samba.org>	2014-06-26 10:00:11 +02:00
Michael Adam	66f3330be8	tdb: improve comments for tdb_free(). Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Volker Lendecke <vl@samba.org>	2014-06-26 10:00:11 +02:00
Michael Adam	8be5c8a6db	tdb: factor merge_with_left_record() out of tdb_free() Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Volker Lendecke <vl@samba.org>	2014-06-26 10:00:11 +02:00
Michael Adam	63673aea9f	tdb: fix debug message in tdb_free() Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Volker Lendecke <vl@samba.org>	2014-06-26 10:00:11 +02:00
Michael Adam	08a76aabe9	tdb: reduce indentation in tdb_free() for merging left Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Volker Lendecke <vl@samba.org>	2014-06-26 10:00:11 +02:00
Michael Adam	87ac4ac523	tdb: increase readability of read_record_on_left() by using early returns and better variable names, and reducing indentation. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Volker Lendecke <vl@samba.org>	2014-06-26 10:00:11 +02:00
Michael Adam	f5a777a36c	tdb: factor read_record_on_left() out of tdb_free() Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Volker Lendecke <vl@samba.org>	2014-06-26 10:00:11 +02:00
Volker Lendecke	db5bda56bf	tdb: add TDB_MUTEX_LOCKING support This adds optional support for locking based on shared robust mutexes. The caller can use the TDB_MUTEX_LOCKING flag together with TDB_CLEAR_IF_FIRST after verifying with tdb_runtime_check_for_robust_mutexes() that it's supported by the current system. The caller should be aware that using TDB_MUTEX_LOCKING implies some limitations, e.g. it's not possible to have multiple read chainlocks on a given hash chain from multiple processes. Note that this doesn't make tdb thread safe! Pair-Programmed-With: Stefan Metzmacher <metze@samba.org> Pair-Programmed-With: Michael Adam <obnox@samba.org> Signed-off-by: Volker Lendecke <vl@samba.org> Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>	2014-05-22 21:05:15 +02:00
Volker Lendecke	cbd73ba163	tdb: introduce tdb->hdr_ofs This makes it possible to have some extra headers before the real tdb content starts in the file. This will be used e.g. to implement locking based on robust mutexes. Pair-Programmed-With: Stefan Metzmacher <metze@samba.org> Pair-Programmed-With: Michael Adam <obnox@samba.org> Signed-off-by: Volker Lendecke <vl@samba.org> Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>	2014-05-22 21:05:15 +02:00
Stefan Metzmacher	c29e64d97e	tdb: introduce TDB_SUPPORTED_FEATURE_FLAGS This will allow to store a feature mask in the tdb header on disk, so that openers can check if they can handle the features other openers are using. Pair-Programmed-With: Volker Lendecke <vl@samba.org> Pair-Programmed-With: Michael Adam <obnox@samba.org> Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Volker Lendecke <vl@samba.org> Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>	2014-05-22 21:05:15 +02:00
Stefan Metzmacher	c0b0648555	tdb: use asprintf() to simplify tdb_summary() Signed-off-by: Stefan Metzmacher <metze@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>	2014-05-22 21:05:15 +02:00
Stefan Metzmacher	e77cbe252f	tdb: return ENOSYS if the tdb was created with spinlocks. Signed-off-by: Stefan Metzmacher <metze@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Stefan Metzmacher <metze@samba.org> Autobuild-Date(master): Mon May 12 21:07:04 CEST 2014 on sn-devel-104	2014-05-12 21:07:04 +02:00
Michael Adam	d9566085c6	tdb: consolidate tdb allocation code - re-use dead records at hash top. When in tdb_store we re-use a dead record reactivated from the target hash chain itself, we currently leave it in its place in the chain. When we re-use a dead record from a different chain or from the freelist instead, we insert it at the beginning of the target chain. This patch changes the behaviour to always newly store a record at the beginning of the hash chain. This removes a special case and hence simplifies the allocation code. On the other hand, it introduces two additional tdb_ofs_write calls for the in-chain-case. Note the subtlety of the patch that by moving the case of the candidate record's chain as new case "i=0" into the for loop, we also reverse the order of the two steps in the for-loop body (non blocking freelist alloc and searching for dead record in a chain) in order to keep the overall order of execution identical. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Volker Lendecke <vl@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Wed Apr 9 10:37:08 CEST 2014 on sn-devel-104	2014-04-09 10:37:08 +02:00
Stefan Metzmacher	80dff80ee9	tdb: don't alter errno on success of tdb_open_ex() Signed-off-by: Stefan Metzmacher <metze@samba.org> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2014-04-02 09:03:42 +02:00
Volker Lendecke	3034a5a62b	tdb: Reduce freelist contention In a metadata-intensive benchmark we have seen the locking.tdb freelist to be one of the central contention points. This patch removes most of the contention on the freelist. Ages ago we already reduced freelist contention by using the even much older DEAD records: If TDB_VOLATILE is set, don't directly put deleted records on the freelist, but just mark a few of them as DEAD. The next new record can then re-use that space without consulting the freelist. This patch builds upon the DEAD records: If we need space and the freelist is busy, instead of doing a blocking wait on the freelist, start looking into other chains for DEAD records and steal them from there. This way every hash chain becomes a small freelist. Just wander around the hash chains as long as the freelist is still busy. With this patch and the tdb mutex patch (following hopefully some time soon) you can see a heavily busy clustered smbd run without locking.tdb futex syscalls. Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2014-03-18 13:42:10 +01:00
Volker Lendecke	1461362e93	tdb: Make "tdb_purge_dead" internally public Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2014-03-18 13:42:10 +01:00
Volker Lendecke	92ce9fd9af	tdb: Make "tdb_find_dead" internally public Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2014-03-18 13:42:10 +01:00
Volker Lendecke	4ca018692f	tdb: Add "last_ptr" to tdb_find_dead Will be used soon to unlink a dead record from a chain Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2014-03-18 13:42:10 +01:00
Volker Lendecke	cb09d7937c	tdb: Move adding tailer space to tdb_find_dead This aligns the tdb_find_dead API with the tdb_allocate API and thus makes it a bit easier to understand, at least for me. Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2014-03-18 13:42:10 +01:00
Volker Lendecke	255edd1b41	tdb: Do a best fit search for dead records Hash chains are (or can be made) short enough that a full search for the best-fitting dead record is feasible. The freelist can become much longer, there we don't do the full search but accept records which are too large. Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2014-03-18 13:42:10 +01:00
Volker Lendecke	d1ce0110f0	tdb: Don't purge records to a blocked freelist If the freelist is heavily contended, we should avoid accessing it Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2014-03-18 13:42:10 +01:00
Volker Lendecke	5f7b481349	tdb: Fix a tdb corruption tdb_purge_dead can change the next pointer of "rec" if we purge the record right behind the current record to be deleted. Just overwrite the magic, not the whole record with stale data. Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2014-03-18 13:42:10 +01:00
Michael Adam	001b9582cc	tdb: always open internal databases with incompatible hash. This makes them more efficient due to better distribution of keys across hash chains. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org> Autobuild-User(master): Jeremy Allison <jra@samba.org> Autobuild-Date(master): Sat Feb 15 08:26:07 CET 2014 on sn-devel-104	2014-02-15 08:26:06 +01:00
Michael Adam	41b7acacb3	tdb: in tdb_delete_hash, make lock/unlock bracket more obvious by using the same variable as hash as in the lock. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org> Autobuild-User(master): Jeremy Allison <jra@samba.org> Autobuild-Date(master): Sat Feb 15 03:21:07 CET 2014 on sn-devel-104	2014-02-15 03:21:07 +01:00
Michael Adam	cde8e290c9	tdb: simplify tdb_delete_hash() a bit Make the lock/unlock bracket more obvious by extracting locking (and finding) from the special cases to the top of the function. This also lets us take the lock and find the record outside the special case branches (use dead records or not). There is a small semantic change implied: In the dead records case, the record to delete is looked up before the current dead records are potentially purged. Hence, if the record to delete is not found, the dead records are also not purged. This does not make a big difference though, because purging is only delayed until directly before the next record to delete is in fact found. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>	2014-02-14 15:55:46 -08:00
Michael Adam	adb2cd1eee	tdb: tdbtool: dump record magic with fixed number of 8 hex digits Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>	2014-02-14 15:53:25 -08:00
Michael Adam	057adfae47	tdb: tdbtool: dump record hash with fixed number of 8 hex digits Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>	2014-02-14 15:53:25 -08:00
Volker Lendecke	f3556bd03b	tdb: Avoid reallocs for lockrecs In normal operations we have at most 3 entries in this array. Don't bother with shrinking. Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org> Autobuild-User(master): Stefan Metzmacher <metze@samba.org> Autobuild-Date(master): Sat Dec 14 13:19:47 CET 2013 on sn-devel-104	2013-12-14 13:19:47 +01:00
Christian Ambach	6d88bfcab4	lib/tdb: fix compiler warnings about a variable shadowing a global declaration Signed-off-by: Christian Ambach <ambi@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>	2013-12-12 14:21:27 -08:00
Volker Lendecke	1f269fcc6e	tdb: Add another overflow check to tdb_expand_adjust Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Rusty Russell <rusty@rustcorp.com.au> Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Mon Jun 3 14:08:54 CEST 2013 on sn-devel-104	2013-06-03 14:08:53 +02:00
Volker Lendecke	d9b4f19e73	tdb: Make tdb_recovery_allocate overflow-safe Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Rusty Russell <rusty@rustcorp.com.au>	2013-06-03 10:21:32 +02:00
Volker Lendecke	8b215df445	tdb: Make tdb_recovery_size overflow-safe Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Rusty Russell <rusty@rustcorp.com.au>	2013-06-03 10:21:31 +02:00
Stefan Metzmacher	7ae09a9695	tdb: add proper OOM/ENOSPC handling to tdb_expand() Failing to do so will result in corrupt tdbs: We will overwrite the hash chain pointers with 0x42424242. Pair-Programmed-With: Volker Lendecke <vl@samba.org> Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Rusty Russell <rusty@rustcorp.com.au>	2013-06-03 10:21:30 +02:00
Stefan Metzmacher	854c5f0aac	tdb: add overflow detection to tdb_expand_adjust() We round up at maximum to a new size of 4GB, but still return at least the given size. The caller has to deal with ENOSPC itself. Pair-Programmed-With: Volker Lendecke <vl@samba.org> Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Rusty Russell <rusty@rustcorp.com.au>	2013-06-03 10:21:28 +02:00
Stefan Metzmacher	e19d46f7e3	tdb: add overflow/ENOSPC handling to tdb_expand_file() Pair-Programmed-With: Volker Lendecke <vl@samba.org> Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Rusty Russell <rusty@rustcorp.com.au>	2013-06-03 10:21:27 +02:00
Stefan Metzmacher	a07ba17e0c	tdb: add a 'new_size' helper variable to tdb_expand_file() Pair-Programmed-With: Volker Lendecke <vl@samba.org> Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Rusty Russell <rusty@rustcorp.com.au>	2013-06-03 10:21:22 +02:00
Volker Lendecke	4483bf143d	tdb: Add overflow-checking tdb_add_off_t Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Rusty Russell <rusty@rustcorp.com.au>	2013-06-03 10:21:20 +02:00
Rusty Russell	3bd686c5ad	tdb: fix logging of offsets and lengths. We can have offsets > 2G, so use unsigned values. Fixes other prints to be native types rather than casts, too. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Reviewed-by: Andrew Bartlett <abartlet@samba.org> Autobuild-User(master): Andrew Bartlett <abartlet@samba.org> Autobuild-Date(master): Tue May 28 11:22:14 CEST 2013 on sn-devel-104	2013-05-28 11:22:14 +02:00
Christian Ambach	11f467d0bd	tdb: include information about hash function being used in tdbtool info output makes it possible to easily determine if the tdb under examination uses jenkins hash or not Signed-off-by: Christian Ambach <ambi@samba.org> Reviewed-by: Volker Lendecke <vl@samba.org>	2013-05-14 14:34:20 +02:00
Volker Lendecke	a92c08e18b	tdb: Little format change Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2013-03-26 10:11:47 +01:00
Volker Lendecke	68698b4e64	tdb: Slightly simplify tdb_expand_file The "else" keywords are not necessary here, we return in the preceding if clause Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org> Autobuild-User(master): Stefan Metzmacher <metze@samba.org> Autobuild-Date(master): Tue Mar 5 14:00:47 CET 2013 on sn-devel-104	2013-03-05 14:00:47 +01:00
Volker Lendecke	a7fdd4f7c2	tdb: Slightly simplify transaction_write realloc(NULL, ...) is equivalent to malloc. We are already using this realloc property for tdb->lockrecs. It should not make any difference in speed, it just makes for a little simpler code. Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org> Autobuild-User(master): Stefan Metzmacher <metze@samba.org> Autobuild-Date(master): Tue Feb 19 17:30:13 CET 2013 on sn-devel-104	2013-02-19 17:30:13 +01:00
Volker Lendecke	fcb345f5d6	tdb: Make tdb_release_transaction_locks use tdb_allrecord_unlock The transaction code uses tdb_allrecord_lock/upgrade, so it should also use the tdb_allrecord_unlock function just for symmetry reasons Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2013-02-19 15:46:45 +01:00
Volker Lendecke	3534e4e8d5	tdb: Factor out the retry loop from tdb_allrecord_upgrade For the mutex code we will have to lock the hashchain and the record lock area independently. So we will have to call the loop twice. And, it's a small refactoring for the better anyway I think. Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2013-02-19 15:46:45 +01:00
Volker Lendecke	1f93f08364	tdb: Simplify fcntl_lock() a bit All arguments but the cmd are the same. To me this looks a bit better and saves some bytes in the object code. Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2013-02-19 15:46:45 +01:00
Volker Lendecke	542400a966	tdb: Use tdb_null in freelistcheck Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2013-02-19 15:46:45 +01:00
Volker Lendecke	05235d5b44	tdb: Fix a typo Signed-off-by: Volker Lendecke <vl@samba.org> Autobuild-User(master): Simo Sorce <idra@samba.org> Autobuild-Date(master): Sat Feb 16 17:13:32 CET 2013 on sn-devel-104	2013-02-16 17:13:32 +01:00
Volker Lendecke	72cd5d5ff6	tdb: Remove "header" from tdb_context header.hash_size was the only thing we ever referenced outside of tdb_open_ex and its direct callees. So this shrinks the tdb_context by 164 bytes. Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org> Autobuild-User(master): Stefan Metzmacher <metze@samba.org> Autobuild-Date(master): Tue Feb 5 13:18:28 CET 2013 on sn-devel-104	2013-02-05 13:18:28 +01:00
Volker Lendecke	71247ec4bd	tdb: Pass argument "header" to check_header_hash Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2013-02-05 08:55:09 +01:00
Volker Lendecke	1436107b07	tdb: Pass argument "header" to tdb_new_database Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2013-02-05 08:54:28 +01:00
Volker Lendecke	f2d67af7bc	tdb: Fix undefined prototype warnings These functions are deliberately left without prototypes according to `3fdeaa399`, but without prototypes we get warnings. Reviewed-by: Rusty Russell <rusty@samba.org> Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Mon Jan 7 11:20:19 CET 2013 on sn-devel-104	2013-01-07 11:20:19 +01:00
Volker Lendecke	a444bb95a2	tdb: Add a comment explaining the "check" I had to ask git blame to find why we have to do it here... Reviewed-by: Rusty Russell <rusty@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org> Autobuild-User(master): Stefan Metzmacher <metze@samba.org> Autobuild-Date(master): Fri Dec 21 13:54:39 CET 2012 on sn-devel-104	2012-12-21 13:54:39 +01:00
Volker Lendecke	3109b541c9	tdb: Make tdb_new_database() follow a more conventional style We usually "goto fail" on every error and then in normal flow set the return variable to success. This patch removes a comment which from my point of view is now obsolete. It violates the {} rule from README.Coding here in favor of the style used in this function. Reviewed-by: Rusty Russell <rusty@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2012-12-21 11:57:01 +01:00
Volker Lendecke	d972e6fa74	tdb: Fix a typo Reviewed-by: Rusty Russell <rusty@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2012-12-21 11:56:47 +01:00
Volker Lendecke	c04de8f3a4	tdb: Fix a typo Reviewed-by: Rusty Russell <rusty@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2012-12-21 11:56:38 +01:00
Volker Lendecke	24755d75b0	tdb: Use tdb_lock_covered_by_allrecord_lock in tdb_unlock Reviewed-by: Rusty Russell <rusty@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2012-12-21 11:56:20 +01:00
Volker Lendecke	f8dafe5685	tdb: Factor out tdb_lock_covered_by_allrecord_lock from tdb_lock_list Reviewed-by: Rusty Russell <rusty@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2012-12-21 11:56:09 +01:00
Volker Lendecke	26b8545df4	tdb: Simplify logic in tdb_lock_list slightly Reviewed-by: Rusty Russell <rusty@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2012-12-21 11:55:55 +01:00
Volker Lendecke	0f4e7a1401	tdb: Slightly simplify tdb_lock_list Avoid an else {} branch when we can do an early return Reviewed-by: Rusty Russell <rusty@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2012-12-21 11:55:15 +01:00
Volker Lendecke	116ec13bb0	tdb: Fix blank line endings Reviewed-by: Rusty Russell <rusty@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2012-12-21 11:54:53 +01:00
Volker Lendecke	7237fdd4dd	tdb: Fix a comment Reviewed-by: Rusty Russell <rusty@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2012-12-21 11:54:47 +01:00
Volker Lendecke	d2b852d79b	tdb: Fix a typo Reviewed-by: Rusty Russell <rusty@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2012-12-21 11:54:40 +01:00
Volker Lendecke	2c3fd8a13e	tdb: Fix a missing CONVERT methods->tdb_write expects data in on-disk format. For reading that record, methods->tdb_read() has taken care of the on-disk to in-memory representation according to the DOCONV() flag passed down. tdb_rec_write() is a wrapper around methods->tdb_write just doing the CONVERT() on the way to disk. Reviewed-by: Rusty Russell <rusty@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2012-12-21 11:54:33 +01:00
Volker Lendecke	c62f8baff8	tdb: Make tdb robust against improper CLEAR_IF_FIRST restart When winbind is restarted, there is a potential crash in tdb. Following situation: We are in a cluster with ctdb. A winbind child hangs in a request to the DC. Cluster monitoring decides the node has a problem. Cluster monitoring decides to kill ctdbd. winbind child still hangs in an RPC request. winbind parent figures that ctdb is dead and immediately commits suicide. winbind parent is restarted by cluster management, overwriting gencache.tdb with CLEAR_IF_FIRST. The CLEAR_IF_FIRST logic as implemented now will not see that a child still has the tdb open; only the parent holds the ACTIVE_LOCK, for performance reasons. While the CLEAR_IF_FIRST logic runs, there is a very small window where we ftruncate(tfd, 0) the file and re-write a proper header without a lock. When during this small window the winbind child comes back, wanting to store something into gencache.tdb, that winbind child will crash with a SIGBUS. Sounds unlikely? 
See: [2012/09/29 07:02:31.871607, 0] lib/util.c:1183(smb_panic) PANIC (pid 1814517): internal error [2012/09/29 07:02:31.877596, 0] lib/util.c:1287(log_stack_trace) BACKTRACE: 35 stack frames: #0 winbindd(log_stack_trace+0x1a) [0x7feb7d4ca18a] #1 winbindd(smb_panic+0x2b) [0x7feb7d4ca25b] #2 winbindd(+0x1a3cc4) [0x7feb7d4bacc4] #3 /lib64/libc.so.6(+0x32900) [0x7feb7a929900] #4 /lib64/libc.so.6(memcpy+0x35) [0x7feb7a97f355] #5 /usr/lib64/libtdb.so.1(+0x6e76) [0x7feb7b0b0e76] #6 /usr/lib64/libtdb.so.1(+0x3d37) [0x7feb7b0add37] #7 /usr/lib64/libtdb.so.1(+0x863d) [0x7feb7b0b263d] #8 /usr/lib64/libtdb.so.1(+0x8700) [0x7feb7b0b2700] #9 /usr/lib64/libtdb.so.1(+0x2505) [0x7feb7b0ac505] #10 /usr/lib64/libtdb.so.1(+0x25b7) [0x7feb7b0ac5b7] #11 /usr/lib64/libtdb.so.1(tdb_fetch+0x13) [0x7feb7b0ac633] #12 winbindd(gencache_set_data_blob+0x259) [0x7feb7d4d8449] #13 winbindd(gencache_set+0x53) [0x7feb7d4d85b3] #14 winbindd(gencache_del+0x5e) [0x7feb7d4d879e] #15 winbindd(saf_delete+0x93) [0x7feb7d54b693] #16 winbindd(+0xe507e) [0x7feb7d3fc07e] #17 winbindd(+0xe85e5) [0x7feb7d3ff5e5] #18 winbindd(+0xe65be) [0x7feb7d3fd5be] #19 winbindd(+0xe7562) [0x7feb7d3fe562] #20 winbindd(init_dc_connection+0x2e) [0x7feb7d3fe5be] #21 winbindd(+0xe75d9) [0x7feb7d3fe5d9] #22 winbindd(cm_connect_netlogon+0x58) [0x7feb7d3fe658] #23 winbindd(_wbint_PingDc+0x61) [0x7feb7d410991] #24 winbindd(+0x103175) [0x7feb7d41a175] #25 winbindd(winbindd_dual_ndrcmd+0xb7) [0x7feb7d4107d7] #26 winbindd(+0xf8609) [0x7feb7d40f609] #27 winbindd(+0xf9075) [0x7feb7d410075] #28 winbindd(tevent_common_loop_immediate+0xe8) [0x7feb7d4db198] #29 winbindd(run_events_poll+0x3c) [0x7feb7d4d93fc] #30 winbindd(+0x1c2b52) [0x7feb7d4d9b52] #31 winbindd(_tevent_loop_once+0x90) [0x7feb7d4d9f60] #32 winbindd(main+0x7b3) [0x7feb7d3e7aa3] #33 /lib64/libc.so.6(__libc_start_main+0xfd) [0x7feb7a915cdd] #34 winbindd(+0xce2a9) [0x7feb7d3e52a9] This is in a winbind child, logfiles surrounding indicate the parent was restarted. 
This patch takes all chain locks around the CLEAR_IF_FIRST introduced tdb_new_database.	2012-10-06 13:23:42 +02:00
Rusty Russell	37fd93194d	tdb: Make robust against shrinking tdbs When probing for a size change (eg. just before tdb_expand, tdb_check, tdb_rescue) we call tdb_oob(tdb, tdb->map_size, 1, 1). Unfortunately this does nothing if the tdb has actually shrunk, which as Volker demonstrated, can actually happen if a "longlived" parent crashes. So move the map/update size/remap before the limit check. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-10-06 13:23:41 +02:00
Rusty Russell	90f463b25f	tdb: add tdb_rescue() This allows for an emergency best-effort dump. It's a little better than strings(1). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-10-04 09:04:19 +09:30
Volker Lendecke	a168a7c791	tdb: Fix a typo Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Tue Oct 2 19:52:16 CEST 2012 on sn-devel-104	2012-10-02 19:52:16 +02:00
Rusty Russell	1783fe3443	tdb: make TDB_NOSYNC merely disable sync. (As suggested by Stefan Metzmacher, based on the change to ntdb.) Since commit `ec96ea690e`, we handle the case where a process dies during a transaction commit. Unfortunately, TDB_NOSYNC means this no longer works, as it disables the recovery area as well as the actual msync/fsync. We should do everything except the syncs. This also means we can do a complete test with $TDB_NO_FSYNC set; just to get more complete coverage, we disable it explicitly for one test (where we override the actual sync calls anyway). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-06-22 07:35:17 +02:00
Amitay Isaacs	3fdeaa3992	lib/tdb: Add/expose lock functions to support CTDB This patch adds two lock functions used by CTDB to perform asynchronous locking. These functions do not actually perform any fcntl operations, but only increment internal counters. - tdb_transaction_write_lock_mark() - tdb_transaction_write_lock_unmark() It also exposes two internal functions - tdb_lock_nonblock() - tdb_unlock() These functions are NOT exposed in include/tdb.h to prevent any further uses of these functions. If you ever need to use these functions, consider using tdb2. Signed-off-by: Amitay Isaacs <amitay@gmail.com>	2012-03-29 20:07:03 +10:30
Rusty Russell	4442c0b2c9	lib/tdb: fix transaction issue for HAVE_INCOHERENT_MMAP. We unmap the tdb on expand, then remap. But when we have INCOHERENT_MMAP (ie. OpenBSD) and we're inside a transaction, doing the expand can mean we need to read from the database to partially fill a transaction block. This fails, because if mmap is incoherent we never allow accessing the database via read/write. The solution is not to unmap and remap until we've actually written the padding at the end of the file. Reported-by: Amitay Isaacs <amitay@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Autobuild-User: Rusty Russell <rusty@rustcorp.com.au> Autobuild-Date: Fri Mar 23 02:53:15 CET 2012 on sn-devel-104	2012-03-23 02:53:15 +01:00
Rusty Russell	330e3e1b91	lib/tdb: fix missing return 0 code. `fde694274e` made tdb_mmap return an int, but didn't put the return 0 on the "internal db" case. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-03-23 10:41:55 +10:30
Rusty Russell	fde694274e	lib/tdb: fix OpenBSD incoherent mmap. This comment appears in two places in the code (commit `4c6a8273c6` from 2001): /* * We must ensure the file is unmapped before doing this * to ensure consistency with systems like OpenBSD where * writes and mmaps are not consistent. */ But this doesn't help, because if one process is using mmap and another using pwrite, we get incoherent results. As demonstrated by OpenBSD's failure on the tdb unit tests. Rather than disable mmap on OpenBSD, we test for this issue and force mmap to be enabled. This means that we will fail on very large TDBs on 32-bit systems, but it's better than the horrendous performance penalty on every OpenBSD system. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-03-22 01:57:37 +01:00
Rusty Russell	390b9a2dd8	tdb: make tdb_private.h idempotent. The most convenient way to write unit tests in C is to directly #include the C files (CCAN uses this, for example). That works quite well, but it means that tdb_private.h now needs to be protected against multiple inclusions. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-02-14 04:04:43 +10:30
Ira Cooper	7b42ceb414	Fix compile when TDB_TRACE is enabled. Autobuild-User: Jeremy Allison <jra@samba.org> Autobuild-Date: Fri Jan 6 04:16:41 CET 2012 on sn-devel-104	2012-01-06 04:16:41 +01:00
Volker Lendecke	c1e9537ed0	tdb: Use tdb_parse_record in tdb_update_hash This avoids a tdb_fetch, thus a malloc/memcpy/free in the tdb_store path	2011-12-25 13:31:58 +01:00
Rusty Russell	5767224b7f	tdb: don't free old recovery area when expanding if already at EOF. We allocate a new recovery area by expanding the file. But if the recovery area is already at the end of file (as shown in at least one client case), we can simply expand the record, rather than freeing it and creating a new one. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Autobuild-User: Rusty Russell <rusty@rustcorp.com.au> Autobuild-Date: Wed Dec 21 06:25:40 CET 2011 on sn-devel-104	2011-12-21 06:25:40 +01:00
Rusty Russell	3a2a755e33	tdb: use same expansion factor logic when expanding for new recovery area. If we're expanding because the current recovery area is too small, we expand only the amount we need. This can quickly lead to exponential growth when we have a slowly-expanding record (hence a slowly-expanding transaction size). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2011-12-21 14:17:16 +10:30
Volker Lendecke	664add1775	tdb: Avoid a malloc/memcpy in _tdb_store	2011-12-19 15:18:08 +01:00
Rusty Russell	b64494535d	tdb: be more careful on 4G files. I came across a tdb which had wrapped to 4G + 4K, and the contents had been destroyed by processes which thought it only 4k long. Fix this by checking on open, and making tdb_oob() check for wrap itself. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Autobuild-User: Rusty Russell <rusty@rustcorp.com.au> Autobuild-Date: Mon Dec 19 07:52:01 CET 2011 on sn-devel-104	2011-12-19 07:52:01 +01:00
Rusty Russell	ee720fc19c	tdb: increment sequence number in tdb_wipe_all(). TDB2 testing revealed that tdb1 doesn't do this. It's minor, but fix it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Autobuild-User: Rusty Russell <rusty@rustcorp.com.au> Autobuild-Date: Tue Aug 16 10:47:41 CEST 2011 on sn-devel-104	2011-08-16 10:47:41 +02:00
Rusty Russell	4fa51257b2	tdb: enable VALGRIND to remove valgrind noise. Andrew Bartlett complained that valgrind needs --partial-loads-ok=yes otherwise the Jenkins hash makes it complain. My benchmarking here revealed that at least with modern gcc (4.5) and CPU (Intel i5 32 bit) there's no measurable performance penalty for the "correct" code, so rip out the optimized one. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Autobuild-User: Rusty Russell <rusty@rustcorp.com.au> Autobuild-Date: Wed Jun 8 11:05:47 CEST 2011 on sn-devel-104	2011-06-08 11:05:47 +02:00
Rusty Russell	36cfa7b79e	tdb: make sure we skip over recovery area correctly. If it's really the recovery area, we can trust the rec_len field, and don't have to go groping for bitpatterns. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Autobuild-User: Rusty Russell <rusty@rustcorp.com.au> Autobuild-Date: Tue Apr 19 14:15:22 CEST 2011 on sn-devel-104	2011-04-19 14:15:22 +02:00
Simo Sorce	cb884186a5	tdb_expand: limit the expansion with huge records ldb can create huge records when saving indexes. Limit the tdb expansion to avoid consuming a lot of memory for no good reason if the record being saved is huge.	2011-04-18 22:15:11 +09:30
Rusty Russell	094ab60053	tdb: tdb_repack() only when it's worthwhile. tdb_repack() is expensive and consumes memory, so we can spend some effort to see if it's worthwhile. In particular, tdbbackup doesn't need to repack: it started with an empty database! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2011-04-18 22:15:11 +09:30
Rusty Russell	6aa72dae8f	tdb: fix transaction recovery area for converted tdbs. This is why macros are dangerous; these were converting the pointers, not the things pointed to! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2011-04-18 22:15:11 +09:30
Volker Lendecke	0080f944b4	tdb: Fix Coverity ID 2238: SECURE_CODING	2011-03-30 09:58:32 +02:00
Volker Lendecke	25397de589	tdb: Fix Coverity ID 2192: NO_EFFECT (ret < 0) can never be true	2011-03-27 22:22:12 +02:00
Volker Lendecke	91cad71390	tdb: Fix a C++ warning Autobuild-User: Volker Lendecke <vlendec@samba.org> Autobuild-Date: Sat Feb 12 19:50:55 CET 2011 on sn-devel-104	2011-02-12 19:50:55 +01:00
Rusty Russell	cac57328a6	tdb: tdb_summary() support. Autobuild-User: Rusty Russell <rusty@rustcorp.com.au> Autobuild-Date: Wed Dec 29 10:12:05 CET 2010 on sn-devel-104	2010-12-29 10:12:05 +01:00
Matthias Dieter Wallnöfer	989d8803f2	tdb:common/open.c - use "discard_const_p" for certain "tdb->name" assignments In order to suppress compiler warnings.	2010-11-27 21:50:42 +01:00
Stefan Metzmacher	dedd064aa8	tdb: set tdb->name early, as it's needed for tdb_name() tdb_name() might be used within the given log function, which might be called from within tdb_open_ex(). metze Autobuild-User: Stefan Metzmacher <metze@samba.org> Autobuild-Date: Fri Nov 12 11:22:21 UTC 2010 on sn-devel-104	2010-11-12 11:22:21 +00:00
Jelmer Vernooij	62c4af9942	tdb: Set _PUBLIC_ in C file rather than header files (Debian bug 600898) Autobuild-User: Jelmer Vernooij <jelmer@samba.org> Autobuild-Date: Thu Oct 21 11:47:22 UTC 2010 on sn-devel-104	2010-10-21 11:47:22 +00:00
Rusty Russell	2dcf76c924	tdb: TDB_INCOMPATIBLE_HASH, to allow safe changing of default hash. This flag to tdb_open/tdb_open_ex affects creation of a new database: 1) Uses the Jenkins lookup3 hash instead of the old gdbm hash if none is specified, 2) Places a non-zero field in header->rwlocks, so older versions of TDB will refuse to open it. This means that the caller (i.e. Samba) can set this flag to safely change the hash function. Versions of TDB from this one on will either use the correct hash or refuse to open (if a different hash is specified). Older TDB versions will see the nonzero rwlocks field and refuse to open it under any conditions. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-09-27 10:48:28 +09:30
Rusty Russell	ccac258d14	tdb: automatically identify Jenkins hash tdbs If the caller to tdb_open_ex() doesn't specify a hash, and tdb_old_hash doesn't match, try tdb_jenkins_hash. This was Metze's idea: it makes life simpler, especially with the upcoming TDB_INCOMPATIBLE_HASH flag. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-09-27 10:48:28 +09:30
Rusty Russell	3258cf3f11	tdb: add Bob Jenkins lookup3 hash as helper hash. This is a better hash than the default: shipping it with tdb makes it easy for callers to use it as the hash by passing it to tdb_open_ex(). This version taken from CCAN and modified, which took it from http://www.burtleburtle.net/bob/c/lookup3.c. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-09-27 10:48:28 +09:30
Günther Deschner	1585c4df68	lib/tdb: fix c++ build warning in tdb_header_hash(). Guenther	2010-09-20 16:15:11 -07:00
Andrew Tridgell	ff515ff477	tdb: added TDB_NO_FSYNC env variable this might help reduce test times and load on test machines	2010-09-16 21:09:17 +10:00
Rusty Russell	786b726300	tdb: put example hashes into header, so we notice incorrect hash_fn. This is Stefan Metzmacher <metze@samba.org>'s patch with minor changes: 1) Use the TDB_MAGIC constant so both hashes aren't of strings. 2) Check the hash in tdb_check (paranoia, really). 3) Additional check in the (unlikely!) case where both examples hash to 0. 4) Cosmetic changes to var names and complaint message. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-09-13 20:05:59 +09:30
Rusty Russell	f77708e962	tdb: fix tdb_check() on other-endian tdbs. We must not endian-convert the magic string, just the rest. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-09-13 19:59:18 +09:30
Rusty Russell	82e5644c9d	tdb: fix tdb_check() on read-only TDBs to actually work. Commit `bc1c82ea13` "Fix tdb_check() to work with read-only tdb databases." claimed to do this, but tdb_lockall_read() fails on read-only databases. Also make sure we can still do tdb_check() inside a transaction (weird, but we previously allowed it so don't break the API). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-09-13 19:58:23 +09:30
Rusty Russell	9e0deff904	tdb: make check more robust against recovery failures. We can end up with dead areas when we die during transaction commit; tdb_check() fails on such a (valid) database. This is particularly noticeable now we no longer truncate on recovery; if the recovery area was at the end of the file we used to remove it that way. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-09-13 19:55:26 +09:30
Rusty Russell	11ab43084b	tdb: workaround starvation problem in locking entire database. We saw tdb_lockall() take 71 seconds under heavy load; this is because Linux (at least) doesn't prevent new small locks being obtained while we're waiting for a big lock. The workaround is to do divide and conquer using non-blocking chainlocks: if we get down to a single chain we block. Using a simple test program where children did "hold lock for 100ms, sleep for 1 second" the time to do tdb_lockall() dropped significantly. There are ln(hashsize) locks taken in the contended case, but that's slow anyway. More analysis is given in my blog at http://rusty.ozlabs.org/?p=120 This may also help transactions, though in that case it's the initial read lock which uses this gradual locking routine; the update-to-write-lock code is separate and still tries to update in one go. Even though ABI doesn't change, minor version bumped so behavior change can be easily detected. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-08-14 02:31:22 +09:30
Jeremy Allison	bc1c82ea13	Fix tdb_check() to work with read-only tdb databases. The function tdb_lockall() uses F_WRLCK internally, which doesn't work on a fd opened with O_RDONLY. Use tdb_lockall_read() instead. Jeremy.	2010-07-29 08:56:35 +09:30
Günther Deschner	f7a3bd4fa4	tdb: fix the build on mac os x 10.6.4. Guenther	2010-07-01 23:14:57 +02:00
Günther Deschner	2eab1d7fdc	tdb: remove unused variable in tdb_new_database(). Guenther	2010-05-11 13:41:17 +02:00
Rusty Russell	91e4a1760d	tdb: fix short write logic in tdb_new_database Commit 207a213c/24fed55d purported to fix the problem of signals during tdb_new_database (which could cause a spurious short write, hence a failure). However, the code is wrong: newdb+written is not correct. Fix this by introducing a general tdb_write_all() and using it here and in the tracing code. Cc: Stefan Metzmacher <metze@samba.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-05-05 15:37:18 +09:30
Andrew Tridgell	773a8afbba	tdb: update tdb ABI to use hide_symbols=True We now use -fvisibility=hidden to hide symbols from outside the tdb shared library. This also moved tdb_transaction_recover() into the tdb_private.h header, as it should never have been a public API. For that reason we are changing the version number. We're only doing a minor version increment as it is extremely unlikely that anyone was actually using tdb_transaction_recover() as its locking requirements were rather unusual. Pair-Programmed-With: Rusty Russell <rusty@samba.org>	2010-04-20 15:50:27 +10:00
Volker Lendecke	261c3b4f1b	tdb: Add a non-blocking version of tdb_transaction_start	2010-03-26 14:27:47 -04:00
Volker Lendecke	59315887a0	tdb: Fix indentation in tdb_new_database()	2010-03-25 10:30:10 +01:00
Volker Lendecke	ea8e0d5d54	Fix some nonempty blank lines	2010-03-25 10:24:45 +01:00
Volker Lendecke	fb98f60594	tdb: If tdb_parse_record does not find a record, return -1 instead of 0	2010-02-28 17:40:59 +01:00
Rusty Russell	ec96ea690e	tdb: handle processes dying during transaction commit. tdb transactions were designed to be robust against the machine powering off, but interestingly were never designed to handle the case where an administrator kill -9's a process during commit. Because recovery is only done on tdb_open, processes with the tdb already mapped will simply use it despite it being corrupt and needing recovery. The solution to this is to check for recovery every time we grab a data lock: we could have gained the lock because a process just died. This has no measurable cost: here is the time for tdbtorture -s 0 -n 1 -l 10000: Before: 2.75 2.50 2.81 3.19 2.91 2.53 2.72 2.50 2.78 2.77 = Avg 2.75 After: 2.81 2.57 3.42 2.49 3.02 2.49 2.84 2.48 2.80 2.43 = Avg 2.74 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-24 13:23:58 +10:30
Rusty Russell	1bf482b9ef	patch tdb-refactor-tdb_lock-and-tdb_lock_nonblock.patch	2010-02-24 13:18:06 +10:30
Rusty Russell	8c3fda4318	tdb: don't truncate tdb on recovery The current recovery code truncates the tdb file on recovery. This is fine if recovery is only done on first open, but is a really bad idea as we move to allowing recovery on "live" databases. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-24 10:50:41 +10:30
Rusty Russell	9f295eecff	tdb: remove lock ops Now the transaction code uses the standard allrecord lock, that stops us from trying to grab any per-record locks anyway. We don't need to have special noop lock ops for transactions. This is a nice simplification: if you see brlock, you know it's really going to grab a lock. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-24 10:49:22 +10:30
Rusty Russell	a84222bbaf	tdb: rename tdb_release_extra_locks() to tdb_release_transaction_locks() tdb_release_extra_locks() is too general: it carefully skips over the transaction lock, even though the only caller then drops it. Change this, and rename it to show it's clearly transaction-specific. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-24 11:02:55 +10:30
Rusty Russell	dd1b508c63	tdb: cleanup: remove ltype argument from _tdb_transaction_cancel. Now the transaction allrecord lock is the standard one, and thus is cleaned in tdb_release_extra_locks(), _tdb_transaction_cancel() doesn't need to know what type it is. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-24 12:42:24 +10:30
Rusty Russell	fca1621965	tdb: tdb_allrecord_lock/tdb_allrecord_unlock/tdb_allrecord_upgrade Centralize locking of all chains of the tdb; rename _tdb_lockall to tdb_allrecord_lock and _tdb_unlockall to tdb_allrecord_unlock, and tdb_brlock_upgrade to tdb_allrecord_upgrade. Then we use this in the transaction code. Unfortunately, if the transaction code records that it has grabbed the allrecord lock read-only, write locks will fail, so we treat this upgradable lock as a write lock, and mark it as upgradable using the otherwise-unused offset field. One subtlety: now the transaction code is using the allrecord_lock, the tdb_release_extra_locks() function drops it for us, so we no longer need to do it manually in _tdb_transaction_cancel. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 15:42:15 +10:30
Rusty Russell	caaf5c6baa	tdb: suppress record write locks when allrecord lock is taken. Records themselves get (read) locked by the traversal code against delete. Interestingly, this locking isn't done when the allrecord lock has been taken, though the allrecord lock until recently didn't cover the actual records (it now goes to end of file). The write record lock, grabbed by the delete code, is not suppressed by the allrecord lock. This is now bad: it causes us to punch a hole in the allrecord lock when we release the write record lock. Make this consistent: no record locks of any kind when the allrecord lock is taken. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-24 10:45:26 +10:30
Rusty Russell	9341f230f8	tdb: cleanup: always grab allrecord lock to infinity. We were previously inconsistent with our "global" lock: the transaction code grabbed it from FREELIST_TOP to end of file, and the rest of the code grabbed it from FREELIST_TOP to end of the hash chains. Change it to always grab to end of file for simplicity and so we can merge the two. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-24 10:45:14 +10:30
Rusty Russell	1ab8776247	tdb: remove num_locks This was redundant before this patch series: it mirrored num_lockrecs exactly. It still does. Also, skip useless branch when locks == 1: unconditional assignment is cheaper anyway. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 15:01:07 +10:30
Rusty Russell	d48c3e4982	tdb: use tdb_nest_lock() for seqnum lock. This is pure overhead, but it centralizes the locking. Realloc (esp. as most implementations are lazy) is fast compared to the fcntl anyway. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 12:40:57 +10:30
Rusty Russell	4738d474c4	tdb: use tdb_nest_lock() for active lock. Use our newly-generic nested lock tracking for the active lock. Note that the tdb_have_extra_locks() and tdb_release_extra_locks() functions have to skip over this lock now it is tracked. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-24 10:44:40 +10:30
Rusty Russell	9136818df3	tdb: use tdb_nest_lock() for open lock. This never nests, so it's overkill, but it centralizes the locking into lock.c and removes the ugly flag in the transaction code to track whether we have the lock or not. Note that we have a temporary hack so this places a real lock, despite the fact that we are in a transaction. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-22 13:58:07 +10:30
Rusty Russell	e8fa70a321	tdb: use tdb_nest_lock() for transaction lock. Rather than a boutique lock and a separate nest count, use our newly-generic nested lock tracking for the transaction lock. Note that the tdb_have_extra_locks() and tdb_release_extra_locks() functions have to skip over this lock now it is tracked. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 12:37:34 +10:30
Rusty Russell	ce41411c84	tdb: cleanup: find_nestlock() helper. Factor out two loops which find locks; we are going to introduce a couple more so a helper makes sense. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 12:35:54 +10:30
Rusty Russell	db270734d8	tdb: cleanup: tdb_release_extra_locks() helper Move locking intelligence back into lock.c, rather than open-coding the lock release in transaction.c. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-24 10:41:15 +10:30
Rusty Russell	fba42f1fb4	tdb: cleanup: tdb_have_extra_locks() helper In many places we check whether locks are held: add a helper to do this. The _tdb_lockall() case has already checked for the allrecord lock, so the extra work done by tdb_have_extra_locks() is merely redundant. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 12:34:26 +10:30
Rusty Russell	b754f61d23	tdb: don't suppress the transaction lock because of the allrecord lock. tdb_transaction_lock() and tdb_transaction_unlock() do nothing if we hold the allrecord lock. However, the two locks don't overlap, so this is wrong. This simplification makes the transaction lock a straight-forward nested lock. There are two callers for these functions: 1) The transaction code, which already makes sure the allrecord_lock isn't held. 2) The traverse code, which wants to stop transactions whether it has the allrecord lock or not. There have been deadlocks here before, however this should not bring them back (I hope!) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 12:31:49 +10:30
Rusty Russell	5d9de604d9	tdb: cleanup: tdb_nest_lock/tdb_nest_unlock Because fcntl locks don't nest, we track them in the tdb->lockrecs array and only place/release them when the count goes to 1/0. We only do this for record locks, so we simply place the list number (or -1 for the free list) in the structure. To generalize this: 1) Put the offset rather than list number in struct tdb_lock_type. 2) Rename _tdb_lock() to tdb_nest_lock, make it non-static and move the allrecord check out to the callers (except the mark case which doesn't care). 3) Rename _tdb_unlock() to tdb_nest_unlock(), make it non-static and move the allrecord out to the callers (except mark again). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 12:26:13 +10:30
Rusty Russell	e9114a7585	tdb: cleanup: rename global_lock to allrecord_lock. The word global is overloaded in tdb. The global_lock inside struct tdb_context is used to indicate we hold a lock across all the chains. Rename it to allrecord_lock. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 12:19:47 +10:30
Rusty Russell	7ab422d6fb	tdb: cleanup: rename GLOBAL_LOCK to OPEN_LOCK. The word global is overloaded in tdb. The GLOBAL_LOCK offset is used at open time to serialize initialization (and by the transaction code to block open). Rename it to OPEN_LOCK. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 12:18:33 +10:30
Rusty Russell	a6e0ef87d2	tdb: make _tdb_transaction_cancel static. Now tdb_open() calls tdb_transaction_cancel() instead of _tdb_transaction_cancel, we can make it static. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-24 10:39:59 +10:30
Rusty Russell	452b4a5a6e	tdb: cleanup: split brlock and brunlock methods. This is taken from the CCAN code base: rather than using tdb_brlock for locking and unlocking, we split it into brlock and brunlock functions. For extra debugging information, brunlock says what kind of lock it is unlocking (even though fcntl locks don't need this). This requires an extra argument to tdb_transaction_unlock() so we know whether the lock was upgraded to a write lock or not. We also use a "flags" argument to tdb_brlock: 1) TDB_LOCK_NOWAIT replaces lck_type = F_SETLK (vs F_SETLKW). 2) TDB_LOCK_MARK_ONLY replaces setting TDB_MARK_LOCK bit in ltype. 3) TDB_LOCK_PROBE replaces the "probe" argument. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 12:17:19 +10:30
Brad Hards	09e756b1d6	Spelling fixes for tdb. Signed-off-by: Matthias Dieter Wallnöfer <mwallnoefer@yahoo.de>	2010-02-22 21:45:31 +01:00
Andrew Tridgell	1373e748aa	tdb: use fdatasync() instead of fsync() in transactions This might help on some filesystems	2010-02-13 22:36:11 +11:00
Volker Lendecke	6824c6f46b	tdb: Apply some const, just for clarity	2010-02-13 12:19:09 +01:00
Rusty Russell	b37b452cb8	tdb: fix recovery reuse after crash If a process (or the machine) dies just after writing the recovery head (pointing at the end of file), the recovery record will be filled with 0x42. This will not invoke a recovery on open, since rec.magic != TDB_RECOVERY_MAGIC. Unfortunately, the first transaction commit will happily reuse that area: tdb_recovery_allocate() doesn't check the magic. The recovery record has length 0x42424242, and it writes that back into the now-valid-looking transaction header for the next comer (which happens to be tdb_wipe_all in my tests). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-10 16:56:14 +10:30
Rusty Russell	6269cdcd15	tdb: give a name to the invalid recovery area constant (0) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-10 16:56:13 +10:30
Volker Lendecke	531059696e	tdb: fix an early release of the global lock that can cause data corruption There was a bug in tdb where the tdb_brlock(tdb, GLOBAL_LOCK, F_UNLCK, F_SETLKW, 0, 1); (ending the transaction-"mutex") was done before the /* remove the recovery marker */ This means that when a transaction is committed there is a window where another opener of the file sees the transaction marker while the transaction committer is still fully functional and working on it. This led to transaction being rolled back by that second opener of the file while transaction_commit() gave no error to the caller. This patch moves the F_UNLCK to after the recovery marker was removed, closing this window.	2010-02-01 15:06:29 +01:00
Stefan Metzmacher	3b9f19ed91	tdb: add TDB_DISALLOW_NESTING and make TDB_ALLOW_NESTING the default behavior We need to keep TDB_ALLOW_NESTING as default behavior, so that existing code continues to work. However we may change the default together with a major version number change in future. metze	2009-11-20 09:45:36 +01:00
Ronnie Sahlberg	436b55db1f	New attempt at TDB transaction nesting allow/disallow. Make the default be that transaction nesting is not allowed and any attempt to create a nested transaction will fail with TDB_ERR_NESTING. If an application can cope with transaction nesting and the implicit semantics of tdb_transaction_commit(), it can enable transaction nesting by using the TDB_ALLOW_NESTING flag. (cherry picked from ctdb commit 3e49e41c21eb8c53084aa8cc7fd3557bdd8eb7b6) Signed-off-by: Stefan Metzmacher <metze@samba.org>	2009-11-20 09:45:34 +01:00
Stefan Metzmacher	85449b7bcc	tdb: always set tdb->tracefd to -1 to be safe on goto fail metze	2009-11-20 09:45:34 +01:00
Volker Lendecke	be88a126ea	tdb: Fix a C++ warning	2009-11-08 00:28:22 +01:00
Kirill Smelkov	b4424f8234	tdb: reset tdb->fd to -1 in tdb_close() So that erroneous double tdb_close() calls do not try to close() same fd again. This is like SAFE_FREE() but for fd. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-10-29 10:14:33 +10:30
Andrew Tridgell	d4c0e8fdf0	tdb: detect tdb store of identical records and skip This can help with ldb where we rewrite the index records	2009-10-25 13:15:18 +11:00
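
Commit 91e4a1760d above describes replacing faulty short-write handling in tdb_new_database() with a general tdb_write_all(). The listing does not show that function, so what follows is only a minimal sketch of the general technique under stated assumptions: `write_all_sketch` is a hypothetical name, not tdb's code, and treating a zero-byte write as an error is this sketch's own choice. It keeps the position in a byte pointer so that resuming after a partial write advances by bytes.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the real tdb_write_all()): call write()
 * repeatedly until all len bytes are written or a real error occurs.
 * Returns 0 on success, -1 on failure with errno set.
 */
static int write_all_sketch(int fd, const void *buf, size_t len)
{
	const char *p = buf;	/* byte pointer, so p += n moves n bytes */

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal: retry */
			return -1;		/* genuine I/O error */
		}
		if (n == 0) {
			/* Assumption: no progress is treated as "no space". */
			errno = ENOSPC;
			return -1;
		}
		p += n;			/* skip the bytes already written */
		len -= (size_t)n;
	}
	return 0;
}
```

A caller would pass the whole buffer once, for example `write_all_sketch(fd, header, header_size)`, and only check the return value; the retry loop hides short writes caused by signals.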

275 Commits