1
0
mirror of https://github.com/samba-team/samba.git synced 2025-06-22 07:17:05 +03:00

49 Commits

Author SHA1 Message Date
Volker Lendecke
db5bda56bf tdb: add TDB_MUTEX_LOCKING support
This adds optional support for locking based on
shared robust mutexes.

The caller can use the TDB_MUTEX_LOCKING flag
together with TDB_CLEAR_IF_FIRST after verifying
with tdb_runtime_check_for_robust_mutexes() that
it's supported by the current system.

The caller should be aware that using TDB_MUTEX_LOCKING
implies some limitations, e.g. it's not possible to
have multiple read chainlocks on a given hash chain
from multiple processes.

Note: that this doesn't make tdb thread safe!

Pair-Programmed-With: Stefan Metzmacher <metze@samba.org>
Pair-Programmed-With: Michael Adam <obnox@samba.org>
Signed-off-by: Volker Lendecke <vl@samba.org>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
2014-05-22 21:05:15 +02:00
Volker Lendecke
d9b4f19e73 tdb: Make tdb_recovery_allocate overflow-safe
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Rusty Russell <rusty@rustcorp.com.au>
2013-06-03 10:21:32 +02:00
Volker Lendecke
8b215df445 tdb: Make tdb_recovery_size overflow-safe
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Rusty Russell <rusty@rustcorp.com.au>
2013-06-03 10:21:31 +02:00
Rusty Russell
3bd686c5ad tdb: fix logging of offets and lengths.
We can have offsets > 2G, so use unsigned values.  Fixes other prints to be
native types rather than casts, too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

Reviewed-by: Andrew Bartlett <abartlet@samba.org>

Autobuild-User(master): Andrew Bartlett <abartlet@samba.org>
Autobuild-Date(master): Tue May 28 11:22:14 CEST 2013 on sn-devel-104
2013-05-28 11:22:14 +02:00
Volker Lendecke
a7fdd4f7c2 tdb: Slightly simplify transaction_write
realloc(NULL, ...) is equivalent to malloc. We are already using this
realloc property for tdb->lockrecs. It should not make any difference
in speed, it just makes for a little simpler code.

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Stefan Metzmacher <metze@samba.org>

Autobuild-User(master): Stefan Metzmacher <metze@samba.org>
Autobuild-Date(master): Tue Feb 19 17:30:13 CET 2013 on sn-devel-104
2013-02-19 17:30:13 +01:00
Volker Lendecke
72cd5d5ff6 tdb: Remove "header" from tdb_context
header.hash_size was the only thing we ever referenced outside of
tdb_open_ex and its direct callees. So this shrinks the tdb_context by
164 bytes.

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Stefan Metzmacher <metze@samba.org>

Autobuild-User(master): Stefan Metzmacher <metze@samba.org>
Autobuild-Date(master): Tue Feb  5 13:18:28 CET 2013 on sn-devel-104
2013-02-05 13:18:28 +01:00
Volker Lendecke
116ec13bb0 tdb: Fix blank line endings
Reviewed-by: Rusty Russell <rusty@samba.org>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
2012-12-21 11:54:53 +01:00
Volker Lendecke
d2b852d79b tdb: Fix a typo
Reviewed-by: Rusty Russell <rusty@samba.org>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
2012-12-21 11:54:40 +01:00
Rusty Russell
1783fe3443 tdb: make TDB_NOSYNC merely disable sync.
(As suggested by Stefan Metzmacher, based on the change to ntdb.)

Since commit ec96ea690edbe3398d690b4a953d487ca1773f1c, we handle the case
where a process dies during a transaction commit.  Unfortunately, TDB_NOSYNC
means this no longer works, as it disables the recovery area as well as the
actual msync/fsync.  We should do everything except the syncs.

This also means we can do a complete test with $TDB_NO_FSYNC set; just
to get more complete coverage, we disable it explicitly for one test
(where we override the actual sync calls anyway).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-22 07:35:17 +02:00
Rusty Russell
5767224b7f tdb: don't free old recovery area when expanding if already at EOF.
We allocate a new recovery area by expanding the file.  But if the
recovery area is already at the end of file (as shown in at least one
client case), we can simply expand the record, rather than freeing it
and creating a new one.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

Autobuild-User: Rusty Russell <rusty@rustcorp.com.au>
Autobuild-Date: Wed Dec 21 06:25:40 CET 2011 on sn-devel-104
2011-12-21 06:25:40 +01:00
Rusty Russell
3a2a755e33 tdb: use same expansion factor logic when expanding for new recovery area.
If we're expanding because the current recovery area is too small, we
expand only the amount we need.  This can quickly lead to exponential
growth when we have a slowly-expanding record (hence a
slowly-expanding transaction size).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-12-21 14:17:16 +10:30
Rusty Russell
b64494535d tdb: be more careful on 4G files.
I came across a tdb which had wrapped to 4G + 4K, and the contents had been
destroyed by processes which thought it only 4k long.  Fix this by checking
on open, and making tdb_oob() check for wrap itself.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

Autobuild-User: Rusty Russell <rusty@rustcorp.com.au>
Autobuild-Date: Mon Dec 19 07:52:01 CET 2011 on sn-devel-104
2011-12-19 07:52:01 +01:00
Rusty Russell
36cfa7b79e tdb: make sure we skip over recovery area correctly.
If it's really the recovery area, we can trust the rec_len field, and
don't have to go groping for bitpatterns.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

Autobuild-User: Rusty Russell <rusty@rustcorp.com.au>
Autobuild-Date: Tue Apr 19 14:15:22 CEST 2011 on sn-devel-104
2011-04-19 14:15:22 +02:00
Rusty Russell
094ab60053 tdb: tdb_repack() only when it's worthwhile.
tdb_repack() is expensive and consumes memory, so we can spend some
effort to see if it's worthwhile.  In particular, tdbbackup doesn't
need to repack: it started with an empty database!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-04-18 22:15:11 +09:30
Rusty Russell
6aa72dae8f tdb: fix transaction recovery area for converted tdbs.
This is why macros are dangerous; these were converting the pointers, not the
things pointed to!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-04-18 22:15:11 +09:30
Jelmer Vernooij
62c4af9942 tdb: Set _PUBLIC_ in C file rather than header files (Debian bug 600898)
Autobuild-User: Jelmer Vernooij <jelmer@samba.org>
Autobuild-Date: Thu Oct 21 11:47:22 UTC 2010 on sn-devel-104
2010-10-21 11:47:22 +00:00
Günther Deschner
f7a3bd4fa4 tdb: fix the build on mac os x 10.6.4.
Guenther
2010-07-01 23:14:57 +02:00
Volker Lendecke
261c3b4f1b tdb: Add a non-blocking version of tdb_transaction_start 2010-03-26 14:27:47 -04:00
Volker Lendecke
ea8e0d5d54 Fix some nonempty blank lines 2010-03-25 10:24:45 +01:00
Rusty Russell
ec96ea690e tdb: handle processes dying during transaction commit.
tdb transactions were designed to be robust against the machine
powering off, but interestingly were never designed to handle the case
where an administrator kill -9's a process during commit.  Because
recovery is only done on tdb_open, processes with the tdb already
mapped will simply use it despite it being corrupt and needing
recovery.

The solution to this is to check for recovery every time we grab a
data lock: we could have gained the lock because a process just died.
This has no measurable cost: here is the time for tdbtorture -s 0 -n 1
-l 10000:

Before:
	2.75 2.50 2.81 3.19 2.91 2.53 2.72 2.50 2.78 2.77 = Avg 2.75

After:
	2.81 2.57 3.42 2.49 3.02 2.49 2.84 2.48 2.80 2.43 = Avg 2.74

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 13:23:58 +10:30
Rusty Russell
8c3fda4318 tdb: don't truncate tdb on recovery
The current recovery code truncates the tdb file on recovery.  This is
fine if recovery is only done on first open, but is a really bad idea
as we move to allowing recovery on "live" databases.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 10:50:41 +10:30
Rusty Russell
9f295eecff tdb: remove lock ops
Now the transaction code uses the standard allrecord lock, that stops
us from trying to grab any per-record locks anyway.  We don't need to
have special noop lock ops for transactions.

This is a nice simplification: if you see brlock, you know it's really
going to grab a lock.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 10:49:22 +10:30
Rusty Russell
a84222bbaf tdb: rename tdb_release_extra_locks() to tdb_release_transaction_locks()
tdb_release_extra_locks() is too general: it carefully skips over the
transaction lock, even though the only caller then drops it.  Change
this, and rename it to show it's clearly transaction-specific.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 11:02:55 +10:30
Rusty Russell
dd1b508c63 tdb: cleanup: remove ltype argument from _tdb_transaction_cancel.
Now the transaction allrecord lock is the standard one, and thus is cleaned
in tdb_release_extra_locks(), _tdb_transaction_cancel() doesn't need to
know what type it is.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 12:42:24 +10:30
Rusty Russell
fca1621965 tdb: tdb_allrecord_lock/tdb_allrecord_unlock/tdb_allrecord_upgrade
Centralize locking of all chains of the tdb; rename _tdb_lockall to
tdb_allrecord_lock and _tdb_unlockall to tdb_allrecord_unlock, and
tdb_brlock_upgrade to tdb_allrecord_upgrade.

Then we use this in the transaction code.  Unfortunately, if the transaction
code records that it has grabbed the allrecord lock read-only, write locks
will fail, so we treat this upgradable lock as a write lock, and mark it
as upgradable using the otherwise-unused offset field.

One subtlety: now the transaction code is using the allrecord_lock, the
tdb_release_extra_locks() function drops it for us, so we no longer need
to do it manually in _tdb_transaction_cancel.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 15:42:15 +10:30
Rusty Russell
9136818df3 tdb: use tdb_nest_lock() for open lock.
This never nests, so it's overkill, but it centralizes the locking into
lock.c and removes the ugly flag in the transaction code to track whether
we have the lock or not.

Note that we have a temporary hack so this places a real lock, despite
the fact that we are in a transaction.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-22 13:58:07 +10:30
Rusty Russell
db270734d8 tdb: cleanup: tdb_release_extra_locks() helper
Move locking intelligence back into lock.c, rather than open-coding the
lock release in transaction.c.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 10:41:15 +10:30
Rusty Russell
fba42f1fb4 tdb: cleanup: tdb_have_extra_locks() helper
In many places we check whether locks are held: add a helper to do this.

The _tdb_lockall() case has already checked for the allrecord lock, so
the extra work done by tdb_have_extra_locks() is merely redundant.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 12:34:26 +10:30
Rusty Russell
5d9de604d9 tdb: cleanup: tdb_nest_lock/tdb_nest_unlock
Because fcntl locks don't nest, we track them in the tdb->lockrecs array
and only place/release them when the count goes to 1/0.  We only do this
for record locks, so we simply place the list number (or -1 for the free
list) in the structure.

To generalize this:

1) Put the offset rather than list number in struct tdb_lock_type.
2) Rename _tdb_lock() to tdb_nest_lock, make it non-static and move the
   allrecord check out to the callers (except the mark case which doesn't
   care).
3) Rename _tdb_unlock() to tdb_nest_unlock(), make it non-static and
   move the allrecord out to the callers (except mark again).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 12:26:13 +10:30
Rusty Russell
e9114a7585 tdb: cleanup: rename global_lock to allrecord_lock.
The word global is overloaded in tdb.  The global_lock inside struct
tdb_context is used to indicate we hold a lock across all the chains.

Rename it to allrecord_lock.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 12:19:47 +10:30
Rusty Russell
7ab422d6fb tdb: cleanup: rename GLOBAL_LOCK to OPEN_LOCK.
The word global is overloaded in tdb.  The GLOBAL_LOCK offset is used at
open time to serialize initialization (and by the transaction code to block
open).

Rename it to OPEN_LOCK.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 12:18:33 +10:30
Rusty Russell
a6e0ef87d2 tdb: make _tdb_transaction_cancel static.
Now tdb_open() calls tdb_transaction_cancel() instead of
_tdb_transaction_cancel, we can make it static.

Signed-off-by: Rusty Russell<rusty@rustcorp.com.au>
2010-02-24 10:39:59 +10:30
Rusty Russell
452b4a5a6e tdb: cleanup: split brlock and brunlock methods.
This is taken from the CCAN code base: rather than using tdb_brlock for
locking and unlocking, we split it into brlock and brunlock functions.

For extra debugging information, brunlock says what kind of lock it is
unlocking (even though fnctl locks don't need this).  This requires an
extra argument to tdb_transaction_unlock() so we know whether the
lock was upgraded to a write lock or not.

We also use a "flags" argument tdb_brlock:
1) TDB_LOCK_NOWAIT replaces lck_type = F_SETLK (vs F_SETLKW).
2) TDB_LOCK_MARK_ONLY replaces setting TDB_MARK_LOCK bit in ltype.
3) TDB_LOCK_PROBE replaces the "probe" argument.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 12:17:19 +10:30
Andrew Tridgell
1373e748aa tdb: use fdatasync() instead of fsync() in transactions
This might help on some filesystems
2010-02-13 22:36:11 +11:00
Volker Lendecke
6824c6f46b tdb: Apply some const, just for clarity 2010-02-13 12:19:09 +01:00
Rusty Russell
b37b452cb8 tdb: fix recovery reuse after crash
If a process (or the machine) dies after just after writing the
recovery head (pointing at the end of file), the recovery record will filled
with 0x42.  This will not invoke a recovery on open, since rec.magic
!= TDB_RECOVERY_MAGIC.

Unfortunately, the first transaction commit will happily reuse that
area: tdb_recovery_allocate() doesn't check the magic.  The recovery
record has length 0x42424242, and it writes that back into the
now-valid-looking transaction header) for the next comer (which
happens to be tdb_wipe_all in my tests).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-10 16:56:14 +10:30
Rusty Russell
6269cdcd15 tdb: give a name to the invalid recovery area constant (0)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-10 16:56:13 +10:30
Volker Lendecke
531059696e tdb: fix an early release of the global lock that can cause data corruption
There was a bug in tdb where the

                tdb_brlock(tdb, GLOBAL_LOCK, F_UNLCK, F_SETLKW, 0, 1);

(ending the transaction-"mutex") was done before the

                        /* remove the recovery marker */

This means that when a transaction is committed there is a window where another
opener of the file sees the transaction marker while the transaction committer
is still fully functional and working on it. This led to transaction being
rolled back by that second opener of the file while transaction_commit() gave
no error to the caller.

This patch moves the F_UNLCK to after the recovery marker was removed, closing
this window.
2010-02-01 15:06:29 +01:00
Stefan Metzmacher
3b9f19ed91 tdb: add TDB_DISALLOW_NESTING and make TDB_ALLOW_NESTING the default behavior
We need to keep TDB_ALLOW_NESTING as default behavior,
so that existing code continues to work.

However we may change the default together with a major version
number change in future.

metze
2009-11-20 09:45:36 +01:00
Ronnie Sahlberg
436b55db1f New attempt at TDB transaction nesting allow/disallow.
Make the default be that transaction is not allowed and any attempt to create a nested transaction will fail with TDB_ERR_NESTING.

If an application can cope with transaction nesting and the implicit
semantics of tdb_transaction_commit(), it can enable transaction nesting
by using the TDB_ALLOW_NESTING flag.
(cherry picked from ctdb commit 3e49e41c21eb8c53084aa8cc7fd3557bdd8eb7b6)

Signed-off-by: Stefan Metzmacher <metze@samba.org>
2009-11-20 09:45:34 +01:00
Stefan Metzmacher
3b62e250c0 tdb: rename 'struct list_struct' into 'struct tdb_record'
metze
2009-10-23 18:27:20 +02:00
Rusty Russell
b77f41d58b lib/tdb: wean off TDB_ERRCODE.
It was a regrettable hack which I used to reduce line count in tdb; in fact it caused confusion as can be seen in this patch.
In particular, ecode now needs to be set before TDB_LOG anyway, and having it exposed in
the header is useless (the struct tdb_context isn't defined, so it's doubly useless).
Also, we should never set errno, as io.c was doing.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-22 00:09:43 +10:30
Rusty Russell
703004340c lib/tdb: TDB_TRACE support (for developers)
When TDB_TRACE is defined (in tdb_private.h), verbose tracing of tdb operations is enabled.
This can be replayed using "replay_trace" from http://ccan.ozlabs.org/info/tdb.

The majority of this patch comes from moving internal functions to _<funcname> to
avoid double-tracing.  There should be no additional overhead for the normal (!TDB_TRACE)
case.

Note that the verbose traces compress really well with rzip.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-22 00:00:12 +10:30
Andrew Tridgell
46c99ec2a3 tdb: allow reads after prepare commit
We previously only allowed a commit to happen after a prepare
commit. It is in fact safe to allow reads between a prepare and a
commit, and the s4 replication code can make use of that, so allow it.
2009-09-15 14:04:22 -07:00
Andrew Tridgell
a386173fa1 auto-repack in transactions that expand the tdb
The idea behind this is to recover from badly fragmented free
lists. Choosing the point where the file expands is fairly arbitrary,
but seems to work well.
2009-06-01 13:11:39 +10:00
Andrew Tridgell
4b4fec65db make TDB_NOSYNC affect all the fsync/msync calls in transactions
During a transaction commit tdb normally uses fsync/msync calls to
make it crash safe. This can be disabled using the TDB_NOSYNC flag,
but it wasn't disabling all the code paths that caused a fsync/msync.
2009-05-28 16:08:28 +10:00
Tim Prouty
42c0931441 tdb: Remove unused variable 2009-03-31 16:24:07 -07:00
Howard Chu
b90863c0b7 Add tdb_transaction_prepare_commit()
Using tdb_transaction_prepare_commit() gives us 2-phase commits. This
allows us to safely commit across multiple tdb databases at once, with
reasonable transaction semantics

Signed-off-by: tridge@samba.org
2009-03-31 13:15:54 +11:00
Jelmer Vernooij
94855cd692 Move common libraries from root to lib/. 2008-09-17 14:11:12 +02:00