samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2025-01-12 09:18:10 +03:00

Author	SHA1	Message	Date
Rusty Russell	ec96ea690e	tdb: handle processes dying during transaction commit. tdb transactions were designed to be robust against the machine powering off, but interestingly were never designed to handle the case where an administrator kill -9's a process during commit. Because recovery is only done on tdb_open, processes with the tdb already mapped will simply use it despite it being corrupt and needing recovery. The solution to this is to check for recovery every time we grab a data lock: we could have gained the lock because a process just died. This has no measurable cost: here is the time for tdbtorture -s 0 -n 1 -l 10000: Before: 2.75 2.50 2.81 3.19 2.91 2.53 2.72 2.50 2.78 2.77 = Avg 2.75 After: 2.81 2.57 3.42 2.49 3.02 2.49 2.84 2.48 2.80 2.43 = Avg 2.74 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-24 13:23:58 +10:30
Rusty Russell	1bf482b9ef	patch tdb-refactor-tdb_lock-and-tdb_lock_nonblock.patch	2010-02-24 13:18:06 +10:30
Rusty Russell	8c3fda4318	tdb: don't truncate tdb on recovery The current recovery code truncates the tdb file on recovery. This is fine if recovery is only done on first open, but is a really bad idea as we move to allowing recovery on "live" databases. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-24 10:50:41 +10:30
Rusty Russell	9f295eecff	tdb: remove lock ops Now the transaction code uses the standard allrecord lock, that stops us from trying to grab any per-record locks anyway. We don't need to have special noop lock ops for transactions. This is a nice simplification: if you see brlock, you know it's really going to grab a lock. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-24 10:49:22 +10:30
Rusty Russell	a84222bbaf	tdb: rename tdb_release_extra_locks() to tdb_release_transaction_locks() tdb_release_extra_locks() is too general: it carefully skips over the transaction lock, even though the only caller then drops it. Change this, and rename it to show it's clearly transaction-specific. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-24 11:02:55 +10:30
Rusty Russell	dd1b508c63	tdb: cleanup: remove ltype argument from _tdb_transaction_cancel. Now the transaction allrecord lock is the standard one, and thus is cleaned in tdb_release_extra_locks(), _tdb_transaction_cancel() doesn't need to know what type it is. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-24 12:42:24 +10:30
Rusty Russell	fca1621965	tdb: tdb_allrecord_lock/tdb_allrecord_unlock/tdb_allrecord_upgrade Centralize locking of all chains of the tdb; rename _tdb_lockall to tdb_allrecord_lock and _tdb_unlockall to tdb_allrecord_unlock, and tdb_brlock_upgrade to tdb_allrecord_upgrade. Then we use this in the transaction code. Unfortunately, if the transaction code records that it has grabbed the allrecord lock read-only, write locks will fail, so we treat this upgradable lock as a write lock, and mark it as upgradable using the otherwise-unused offset field. One subtlety: now the transaction code is using the allrecord_lock, the tdb_release_extra_locks() function drops it for us, so we no longer need to do it manually in _tdb_transaction_cancel. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 15:42:15 +10:30
Rusty Russell	caaf5c6baa	tdb: suppress record write locks when allrecord lock is taken. Records themselves get (read) locked by the traversal code against delete. Interestingly, this locking isn't done when the allrecord lock has been taken, though the allrecord lock until recently didn't cover the actual records (it now goes to end of file). The write record lock, grabbed by the delete code, is not suppressed by the allrecord lock. This is now bad: it causes us to punch a hole in the allrecord lock when we release the write record lock. Make this consistent: no record locks of any kind when the allrecord lock is taken. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-24 10:45:26 +10:30
Rusty Russell	9341f230f8	tdb: cleanup: always grab allrecord lock to infinity. We were previously inconsistent with our "global" lock: the transaction code grabbed it from FREELIST_TOP to end of file, and the rest of the code grabbed it from FREELIST_TOP to end of the hash chains. Change it to always grab to end of file for simplicity and so we can merge the two. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-24 10:45:14 +10:30
Rusty Russell	1ab8776247	tdb: remove num_locks This was redundant before this patch series: it mirrored num_lockrecs exactly. It still does. Also, skip useless branch when locks == 1: unconditional assignment is cheaper anyway. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 15:01:07 +10:30
Rusty Russell	d48c3e4982	tdb: use tdb_nest_lock() for seqnum lock. This is pure overhead, but it centralizes the locking. Realloc (esp. as most implementations are lazy) is fast compared to the fnctl anyway. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 12:40:57 +10:30
Rusty Russell	4738d474c4	tdb: use tdb_nest_lock() for active lock. Use our newly-generic nested lock tracking for the active lock. Note that the tdb_have_extra_locks() and tdb_release_extra_locks() functions have to skip over this lock now it is tracked. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-24 10:44:40 +10:30
Rusty Russell	9136818df3	tdb: use tdb_nest_lock() for open lock. This never nests, so it's overkill, but it centralizes the locking into lock.c and removes the ugly flag in the transaction code to track whether we have the lock or not. Note that we have a temporary hack so this places a real lock, despite the fact that we are in a transaction. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-22 13:58:07 +10:30
Rusty Russell	e8fa70a321	tdb: use tdb_nest_lock() for transaction lock. Rather than a boutique lock and a separate nest count, use our newly-generic nested lock tracking for the transaction lock. Note that the tdb_have_extra_locks() and tdb_release_extra_locks() functions have to skip over this lock now it is tracked. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 12:37:34 +10:30
Rusty Russell	ce41411c84	tdb: cleanup: find_nestlock() helper. Factor out two loops which find locks; we are going to introduce a couple more so a helper makes sense. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 12:35:54 +10:30
Rusty Russell	db270734d8	tdb: cleanup: tdb_release_extra_locks() helper Move locking intelligence back into lock.c, rather than open-coding the lock release in transaction.c. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-24 10:41:15 +10:30
Rusty Russell	fba42f1fb4	tdb: cleanup: tdb_have_extra_locks() helper In many places we check whether locks are held: add a helper to do this. The _tdb_lockall() case has already checked for the allrecord lock, so the extra work done by tdb_have_extra_locks() is merely redundant. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 12:34:26 +10:30
Rusty Russell	b754f61d23	tdb: don't suppress the transaction lock because of the allrecord lock. tdb_transaction_lock() and tdb_transaction_unlock() do nothing if we hold the allrecord lock. However, the two locks don't overlap, so this is wrong. This simplification makes the transaction lock a straight-forward nested lock. There are two callers for these functions: 1) The transaction code, which already makes sure the allrecord_lock isn't held. 2) The traverse code, which wants to stop transactions whether it has the allrecord lock or not. There have been deadlocks here before, however this should not bring them back (I hope!) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 12:31:49 +10:30
Rusty Russell	5d9de604d9	tdb: cleanup: tdb_nest_lock/tdb_nest_unlock Because fcntl locks don't nest, we track them in the tdb->lockrecs array and only place/release them when the count goes to 1/0. We only do this for record locks, so we simply place the list number (or -1 for the free list) in the structure. To generalize this: 1) Put the offset rather than list number in struct tdb_lock_type. 2) Rename _tdb_lock() to tdb_nest_lock, make it non-static and move the allrecord check out to the callers (except the mark case which doesn't care). 3) Rename _tdb_unlock() to tdb_nest_unlock(), make it non-static and move the allrecord out to the callers (except mark again). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 12:26:13 +10:30
Rusty Russell	e9114a7585	tdb: cleanup: rename global_lock to allrecord_lock. The word global is overloaded in tdb. The global_lock inside struct tdb_context is used to indicate we hold a lock across all the chains. Rename it to allrecord_lock. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 12:19:47 +10:30
Rusty Russell	7ab422d6fb	tdb: cleanup: rename GLOBAL_LOCK to OPEN_LOCK. The word global is overloaded in tdb. The GLOBAL_LOCK offset is used at open time to serialize initialization (and by the transaction code to block open). Rename it to OPEN_LOCK. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 12:18:33 +10:30
Rusty Russell	a6e0ef87d2	tdb: make _tdb_transaction_cancel static. Now tdb_open() calls tdb_transaction_cancel() instead of _tdb_transaction_cancel, we can make it static. Signed-off-by: Rusty Russell<rusty@rustcorp.com.au>	2010-02-24 10:39:59 +10:30
Rusty Russell	452b4a5a6e	tdb: cleanup: split brlock and brunlock methods. This is taken from the CCAN code base: rather than using tdb_brlock for locking and unlocking, we split it into brlock and brunlock functions. For extra debugging information, brunlock says what kind of lock it is unlocking (even though fnctl locks don't need this). This requires an extra argument to tdb_transaction_unlock() so we know whether the lock was upgraded to a write lock or not. We also use a "flags" argument tdb_brlock: 1) TDB_LOCK_NOWAIT replaces lck_type = F_SETLK (vs F_SETLKW). 2) TDB_LOCK_MARK_ONLY replaces setting TDB_MARK_LOCK bit in ltype. 3) TDB_LOCK_PROBE replaces the "probe" argument. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-17 12:17:19 +10:30
Brad Hards	09e756b1d6	Spelling fixes for tdb. Signed-off-by: Matthias Dieter Wallnöfer <mwallnoefer@yahoo.de>	2010-02-22 21:45:31 +01:00
Andrew Tridgell	1373e748aa	tdb: use fdatasync() instead of fsync() in transactions This might help on some filesystems	2010-02-13 22:36:11 +11:00
Volker Lendecke	6824c6f46b	tdb: Apply some const, just for clarity	2010-02-13 12:19:09 +01:00
Rusty Russell	b37b452cb8	tdb: fix recovery reuse after crash If a process (or the machine) dies after just after writing the recovery head (pointing at the end of file), the recovery record will filled with 0x42. This will not invoke a recovery on open, since rec.magic != TDB_RECOVERY_MAGIC. Unfortunately, the first transaction commit will happily reuse that area: tdb_recovery_allocate() doesn't check the magic. The recovery record has length 0x42424242, and it writes that back into the now-valid-looking transaction header) for the next comer (which happens to be tdb_wipe_all in my tests). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-10 16:56:14 +10:30
Rusty Russell	6269cdcd15	tdb: give a name to the invalid recovery area constant (0) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-02-10 16:56:13 +10:30
Volker Lendecke	531059696e	tdb: fix an early release of the global lock that can cause data corruption There was a bug in tdb where the tdb_brlock(tdb, GLOBAL_LOCK, F_UNLCK, F_SETLKW, 0, 1); (ending the transaction-"mutex") was done before the /* remove the recovery marker */ This means that when a transaction is committed there is a window where another opener of the file sees the transaction marker while the transaction committer is still fully functional and working on it. This led to transaction being rolled back by that second opener of the file while transaction_commit() gave no error to the caller. This patch moves the F_UNLCK to after the recovery marker was removed, closing this window.	2010-02-01 15:06:29 +01:00
Stefan Metzmacher	3b9f19ed91	tdb: add TDB_DISALLOW_NESTING and make TDB_ALLOW_NESTING the default behavior We need to keep TDB_ALLOW_NESTING as default behavior, so that existing code continues to work. However we may change the default together with a major version number change in future. metze	2009-11-20 09:45:36 +01:00
Ronnie Sahlberg	436b55db1f	New attempt at TDB transaction nesting allow/disallow. Make the default be that transaction is not allowed and any attempt to create a nested transaction will fail with TDB_ERR_NESTING. If an application can cope with transaction nesting and the implicit semantics of tdb_transaction_commit(), it can enable transaction nesting by using the TDB_ALLOW_NESTING flag. (cherry picked from ctdb commit 3e49e41c21eb8c53084aa8cc7fd3557bdd8eb7b6) Signed-off-by: Stefan Metzmacher <metze@samba.org>	2009-11-20 09:45:34 +01:00
Stefan Metzmacher	85449b7bcc	tdb: always set tdb->tracefd to -1 to be safe on goto fail metze	2009-11-20 09:45:34 +01:00
Volker Lendecke	be88a126ea	tdb: Fix a C++ warning	2009-11-08 00:28:22 +01:00
Kirill Smelkov	b4424f8234	tdb: reset tdb->fd to -1 in tdb_close() So that erroneous double tdb_close() calls do not try to close() same fd again. This is like SAFE_FREE() but for fd. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-10-29 10:14:33 +10:30
Andrew Tridgell	d4c0e8fdf0	tdb: detect tdb store of identical records and skip This can help with ldb where we rewrite the index records	2009-10-25 13:15:18 +11:00
Stefan Metzmacher	3b62e250c0	tdb: rename 'struct list_struct' into 'struct tdb_record' metze	2009-10-23 18:27:20 +02:00
Rusty Russell	022b4d4aa6	lib/tdb: add tdb_check() ctdb wants a quick way to detect corrupt tdbs; particularly, tdbs with loops in their hash chains. tdb_check() provides this. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-10-22 00:10:34 +10:30
Rusty Russell	b77f41d58b	lib/tdb: wean off TDB_ERRCODE. It was a regrettable hack which I used to reduce line count in tdb; in fact it caused confusion as can be seen in this patch. In particular, ecode now needs to be set before TDB_LOG anyway, and having it exposed in the header is useless (the struct tdb_context isn't defined, so it's doubly useless). Also, we should never set errno, as io.c was doing. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-10-22 00:09:43 +10:30
Rusty Russell	703004340c	lib/tdb: TDB_TRACE support (for developers) When TDB_TRACE is defined (in tdb_private.h), verbose tracing of tdb operations is enabled. This can be replayed using "replay_trace" from http://ccan.ozlabs.org/info/tdb. The majority of this patch comes from moving internal functions to _<funcname> to avoid double-tracing. There should be no additional overhead for the normal (!TDB_TRACE) case. Note that the verbose traces compress really well with rzip. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-10-22 00:00:12 +10:30
Andrew Tridgell	46c99ec2a3	tdb: allow reads after prepare commit We previously only allowed a commit to happen after a prepare commit. It is in fact safe to allow reads between a prepare and a commit, and the s4 replication code can make use of that, so allow it.	2009-09-15 14:04:22 -07:00
Günther Deschner	1c2f4919ab	tdb: fix c++ build warning. Guenther	2009-09-07 11:57:10 +02:00
Rusty Russell	398d0c2929	lib/tdb: don't overwrite TDBs with different version numbers. In future, this may happen, and we don't want to clobber them. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-08-28 13:43:05 +10:00
Jeremy Allison	4fc9f9c3f9	Add define guards around otherwise unused variable. Jeremy.	2009-08-06 11:47:08 -07:00
Rusty Russell	252f7da702	There is one signedness issue in tdb which prevents traverses of TDB records over the 2G offset on systems which support 64 bit file offsets. This fixes that case. On systems with 32 bit offsets, expansion and fcntl locking on these records will fail anyway. SAMBA already does '#define _FILE_OFFSET_BITS 64' in config.h (on my 32-bit x86 Linux system at least) to get 64 bit file offsets. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-08-06 13:13:42 +10:00
Rusty Russell	a207cca1d3	tdb: don't alter tdb->flags in tdb_reopen_all() The flags are user-visible, via tdb_get_flags/add_flags/remove_flags. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Stefan Metzmacher <metze@samba.org>	2009-07-31 14:40:28 +02:00
Rusty Russell	3b2f074bda	tdb: Reimplementation of Metze's "lib/tdb: if we know pwrite and pread are thread/fork safe tdb_reopen_all() should be a noop". This version just wraps the reopen code, so we still re-grab the lock and do the normal sanity checks. The reason we do this at all is to avoid global fd limits, see: http://forums.fedoraforum.org/showthread.php?t=210393 Note also that this whole reopen concept is fundamentally racy: if the parent goes away before the child calls tdb_reopen_all, the database can be left without an active lock and another TDB_CLEAR_IF_FIRST opener will clear it. A fork_with_tdbs() wrapper could use a pipe to solve this, but it's hardly elegant (what if there are other independent things which have similar needs?). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Stefan Metzmacher <metze@samba.org>	2009-07-31 14:40:28 +02:00
Rusty Russell	fa91bc6719	tdb: Revert "lib/tdb: if we know pwrite and pread are thread/fork safe tdb_reopen_all() should be a noop" This reverts commit `e17df483fb`. tdb_reopen_all also restores the active lock, required for TDB_CLEAR_IF_FIRST. Signed-off-by: Stefan Metzmacher <metze@samba.org>	2009-07-31 14:40:28 +02:00
Rusty Russell	36c0f0f99a	realloc() has that horrible overloaded free semantic when size is 0: current code does a free of the old record in this case, then fail.	2009-07-30 13:10:33 -07:00
Rusty Russell	a88c281ddc	If the record is at the end of the database, pretending it has length 1 might take us out-of-bounds. Only pretend to be length 1 for the malloc.	2009-07-30 13:09:33 -07:00
Rusty Russell	760104188d	tdb: fix locking error `54a51839ea` "Make tdb transaction lock recursive (samba version)" was broken: I "cleaned it up" and prevented it from ever unlocking. To see the problem: $ bin/tdbtorture -s 1248142523 tdb_brlock failed (fd=3) at offset 8 rw_type=1 lck_type=14 len=1 tdb_transaction_lock: failed to get transaction lock tdb_transaction_start failed: Resource deadlock avoided My testcase relied on the count being correct, which it was. Fixing that now. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Michael Adam <obnox@samba.org>	2009-07-21 10:21:53 +02:00
Rusty Russell	54a51839ea	Make tdb transaction lock recursive (samba version) This patch replaces `6ed27edbcd` and `1a416ff13c`, which fixed the bug where traversals inside transactions would release the transaction lock early. This solution is more general, and solves the more minor symptom that nested traversals would also release the transaction lock early. (It was also suggestd in Volker's comment in `6ed27ed`). This patch also applies to ctdb, if the traverse.c part is removed (ctdb's tdb code never received the previous two fixes). Tested using the testsuite from ccan (adapted to the samba code). Thanks to Michael Adam for feedback. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Michael Adam <obnox@samba.org>	2009-07-20 22:17:20 +02:00
Andrew Tridgell	a6cc04a200	overallocate all records by 25% This greatly reduces the fragmentation of databases where records tend to grow slowly by a small amount each time. The case where this is most seen is the ldb index records. Adding this overallocation reduced the size of the resulting database by more than 20x when running a test that adds 10k users.	2009-06-01 13:13:07 +10:00
Andrew Tridgell	a386173fa1	auto-repack in transactions that expand the tdb The idea behind this is to recover from badly fragmented free lists. Choosing the point where the file expands is fairly arbitrary, but seems to work well.	2009-06-01 13:11:39 +10:00
Andrew Tridgell	4b4fec65db	make TDB_NOSYNC affect all the fsync/msync calls in transactions During a transaction commit tdb normally uses fsync/msync calls to make it crash safe. This can be disabled using the TDB_NOSYNC flag, but it wasn't disabling all the code paths that caused a fsync/msync.	2009-05-28 16:08:28 +10:00
Jim McDonough	a91bcbccf8	Detect tight loop in tdb_find()	2009-05-21 16:29:48 -04:00
Tim Prouty	42c0931441	tdb: Remove unused variable	2009-03-31 16:24:07 -07:00
Howard Chu	b90863c0b7	Add tdb_transaction_prepare_commit() Using tdb_transaction_prepare_commit() gives us 2-phase commits. This allows us to safely commit across multiple tdb databases at once, with reasonable transaction semantics Signed-off-by: tridge@samba.org	2009-03-31 13:15:54 +11:00
Stefan Metzmacher	e17df483fb	lib/tdb: if we know pwrite and pread are thread/fork safe tdb_reopen_all() should be a noop The reason for tdb_reopen_all() is that the seek pointer on fds are shared between parent and child. metze	2009-02-25 13:57:11 -08:00
Andrew Tridgell	936d76802f	imported the tdb_repack() code from CTDB The tdb_repack() function repacks a TDB so that it has a single freelist entry. The file doesn't shrink, but it does remove all freelist fragmentation. This code originated in the CTDB vacuuming code, but will now be used in ldb to cope with fragmentation from re-indexing	2008-12-16 14:38:17 +11:00
Jelmer Vernooij	94855cd692	Move common libraries from root to lib/.	2008-09-17 14:11:12 +02:00

1 2 3

110 Commits