1
0
mirror of https://github.com/samba-team/samba.git synced 2025-06-22 07:17:05 +03:00

42 Commits

Author SHA1 Message Date
Volker Lendecke
db5bda56bf tdb: add TDB_MUTEX_LOCKING support
This adds optional support for locking based on
shared robust mutexes.

The caller can use the TDB_MUTEX_LOCKING flag
together with TDB_CLEAR_IF_FIRST after verifying
with tdb_runtime_check_for_robust_mutexes() that
it's supported by the current system.

The caller should be aware that using TDB_MUTEX_LOCKING
implies some limitations, e.g. it's not possible to
have multiple read chainlocks on a given hash chain
from multiple processes.

Note: that this doesn't make tdb thread safe!

Pair-Programmed-With: Stefan Metzmacher <metze@samba.org>
Pair-Programmed-With: Michael Adam <obnox@samba.org>
Signed-off-by: Volker Lendecke <vl@samba.org>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
2014-05-22 21:05:15 +02:00
Volker Lendecke
cbd73ba163 tdb: introduce tdb->hdr_ofs
This makes it possible to have some extra headers before
the real tdb content starts in the file.

This will be used used e.g. to implement locking based on robust mutexes.

Pair-Programmed-With: Stefan Metzmacher <metze@samba.org>
Pair-Programmed-With: Michael Adam <obnox@samba.org>
Signed-off-by: Volker Lendecke <vl@samba.org>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
2014-05-22 21:05:15 +02:00
Stefan Metzmacher
c29e64d97e tdb: introduce TDB_SUPPORTED_FEATURE_FLAGS
This will allow to store a feature mask in the tdb header on disk,
so that openers can check if they can handle the features
other openers are using.

Pair-Programmed-With: Volker Lendecke <vl@samba.org>
Pair-Programmed-With: Michael Adam <obnox@samba.org>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Volker Lendecke <vl@samba.org>
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
2014-05-22 21:05:15 +02:00
Volker Lendecke
3034a5a62b tdb: Reduce freelist contention
In a metadata-intensive benchmark we have seen the locking.tdb freelist to be
one of the central contention points. This patch removes most of the contention
on the freelist. Ages ago we already reduced freelist contention by using the
even much older DEAD records: If TDB_VOLATILE is set, don't directly put
deleted records on the freelist, but just mark a few of them just as DEAD. The
next new record can them re-use that space without consulting the freelist.

This patch builds upon the DEAD records: If we need space and the freelist is
busy, instead of doing a blocking wait on the freelist, start looking into
other chains for DEAD records and steal them from there. This way every hash
chain becomes a small freelist. Just wander around the hash chains as long as
the freelist is still busy.

With this patch and the tdb mutex patch (following hopefully some time soon)
you can see a heavily busy clustered smbd run without locking.tdb futex
syscalls.

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
2014-03-18 13:42:10 +01:00
Volker Lendecke
1461362e93 tdb: Make "tdb_purge_dead" internally public
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
2014-03-18 13:42:10 +01:00
Volker Lendecke
92ce9fd9af tdb: Make "tdb_find_dead" internally public
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
2014-03-18 13:42:10 +01:00
Volker Lendecke
f3556bd03b tdb: Avoid reallocs for lockrecs
In normal operations we have at most 3 entries in this array. Don't
bother with shrinking.

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Stefan Metzmacher <metze@samba.org>

Autobuild-User(master): Stefan Metzmacher <metze@samba.org>
Autobuild-Date(master): Sat Dec 14 13:19:47 CET 2013 on sn-devel-104
2013-12-14 13:19:47 +01:00
Volker Lendecke
8b215df445 tdb: Make tdb_recovery_size overflow-safe
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Rusty Russell <rusty@rustcorp.com.au>
2013-06-03 10:21:31 +02:00
Volker Lendecke
4483bf143d tdb: Add overflow-checking tdb_add_off_t
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Rusty Russell <rusty@rustcorp.com.au>
2013-06-03 10:21:20 +02:00
Volker Lendecke
72cd5d5ff6 tdb: Remove "header" from tdb_context
header.hash_size was the only thing we ever referenced outside of
tdb_open_ex and its direct callees. So this shrinks the tdb_context by
164 bytes.

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Stefan Metzmacher <metze@samba.org>

Autobuild-User(master): Stefan Metzmacher <metze@samba.org>
Autobuild-Date(master): Tue Feb  5 13:18:28 CET 2013 on sn-devel-104
2013-02-05 13:18:28 +01:00
Volker Lendecke
116ec13bb0 tdb: Fix blank line endings
Reviewed-by: Rusty Russell <rusty@samba.org>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
2012-12-21 11:54:53 +01:00
Rusty Russell
fde694274e lib/tdb: fix OpenBSD incoherent mmap.
This comment appears in two places in the code (commit
4c6a8273c6dd3e2aeda5a63c4a62aa55bc133099 from 2001):

	/*
	 * We must ensure the file is unmapped before doing this
	 * to ensure consistency with systems like OpenBSD where
	 * writes and mmaps are not consistent.
	 */

But this doesn't help, because if one process is using mmap and another
using pwrite, we get incoherent results.  As demonstrated by OpenBSD's
failure on the tdb unit tests.

Rather than disable mmap on OpenBSD, we test for this issue and force mmap
to be enabled.  This means that we will fail on very large TDBs on 32-bit
systems, but it's better than the horrendous performance penalty on every
OpenBSD system.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-03-22 01:57:37 +01:00
Rusty Russell
390b9a2dd8 tdb: make tdb_private.h idempotent.
The most convenient way to write unit tests in C is to directly
#include the C files (CCAN uses this, for example).  That works quite
well, but it means that tdb_private.h now needs to be protected
against multiple inclusions.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-02-14 04:04:43 +10:30
Rusty Russell
3a2a755e33 tdb: use same expansion factor logic when expanding for new recovery area.
If we're expanding because the current recovery area is too small, we
expand only the amount we need.  This can quickly lead to exponential
growth when we have a slowly-expanding record (hence a
slowly-expanding transaction size).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-12-21 14:17:16 +10:30
Rusty Russell
b64494535d tdb: be more careful on 4G files.
I came across a tdb which had wrapped to 4G + 4K, and the contents had been
destroyed by processes which thought it only 4k long.  Fix this by checking
on open, and making tdb_oob() check for wrap itself.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

Autobuild-User: Rusty Russell <rusty@rustcorp.com.au>
Autobuild-Date: Mon Dec 19 07:52:01 CET 2011 on sn-devel-104
2011-12-19 07:52:01 +01:00
Rusty Russell
36cfa7b79e tdb: make sure we skip over recovery area correctly.
If it's really the recovery area, we can trust the rec_len field, and
don't have to go groping for bitpatterns.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

Autobuild-User: Rusty Russell <rusty@rustcorp.com.au>
Autobuild-Date: Tue Apr 19 14:15:22 CEST 2011 on sn-devel-104
2011-04-19 14:15:22 +02:00
Rusty Russell
cac57328a6 tdb: tdb_summary() support.
Autobuild-User: Rusty Russell <rusty@rustcorp.com.au>
Autobuild-Date: Wed Dec 29 10:12:05 CET 2010 on sn-devel-104
2010-12-29 10:12:05 +01:00
Rusty Russell
2dcf76c924 tdb: TDB_INCOMPATIBLE_HASH, to allow safe changing of default hash.
This flag to tdb_open/tdb_open_ex effects creation of a new database:
1) Uses the Jenkins lookup3 hash instead of the old gdbm hash if none is
   specified,
2) Places a non-zero field in header->rwlocks, so older versions of TDB will
   refuse to open it.

This means that the caller (ie Samba) can set this flag to safely
change the hash function.  Versions of TDB from this one on will either
use the correct hash or refuse to open (if a different hash is specified).
Older TDB versions will see the nonzero rwlocks field and refuse to open
it under any conditions.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-09-27 10:48:28 +09:30
Rusty Russell
3258cf3f11 tdb: add Bob Jenkins lookup3 hash as helper hash.
This is a better hash than the default: shipping it with tdb makes it easy
for callers to use it as the hash by passing it to tdb_open_ex().

This version taken from CCAN and modified, which took it from
http://www.burtleburtle.net/bob/c/lookup3.c.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-09-27 10:48:28 +09:30
Rusty Russell
786b726300 tdb: put example hashes into header, so we notice incorrect hash_fn.
This is Stefan Metzmacher <metze@samba.org>'s patch with minor changes:
1) Use the TDB_MAGIC constant so both hashes aren't of strings.
2) Check the hash in tdb_check (paranoia, really).
3) Additional check in the (unlikely!) case where both examples hash to 0.
4) Cosmetic changes to var names and complaint message.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-09-13 20:05:59 +09:30
Rusty Russell
91e4a1760d tdb: fix short write logic in tdb_new_database
Commit 207a213c/24fed55d purported to fix the problem of signals during
tdb_new_database (which could cause a spurious short write, hence a failure).
However, the code is wrong: newdb+written is not correct.

Fix this by introducing a general tdb_write_all() and using it here and in
the tracing code.

Cc: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-05-05 15:37:18 +09:30
Andrew Tridgell
773a8afbba tdb: update tdb ABI to use hide_symbols=True
We now use -fvisibilty=hidden to hide symbols from outside the tdb
shared library.

This also moved tdb_transaction_recover() into the tdb_private.h
header, as it should never have been a public API. For that reason we
are changing the version number. We're only doing a minor version
increment as it is extremely unlikely that anyone was actually using
tdb_transaction_recover() as its locking requirements were rather
unusual.

Pair-Programmed-With: Rusty Russell <rusty@samba.org>
2010-04-20 15:50:27 +10:00
Volker Lendecke
261c3b4f1b tdb: Add a non-blocking version of tdb_transaction_start 2010-03-26 14:27:47 -04:00
Volker Lendecke
ea8e0d5d54 Fix some nonempty blank lines 2010-03-25 10:24:45 +01:00
Rusty Russell
ec96ea690e tdb: handle processes dying during transaction commit.
tdb transactions were designed to be robust against the machine
powering off, but interestingly were never designed to handle the case
where an administrator kill -9's a process during commit.  Because
recovery is only done on tdb_open, processes with the tdb already
mapped will simply use it despite it being corrupt and needing
recovery.

The solution to this is to check for recovery every time we grab a
data lock: we could have gained the lock because a process just died.
This has no measurable cost: here is the time for tdbtorture -s 0 -n 1
-l 10000:

Before:
	2.75 2.50 2.81 3.19 2.91 2.53 2.72 2.50 2.78 2.77 = Avg 2.75

After:
	2.81 2.57 3.42 2.49 3.02 2.49 2.84 2.48 2.80 2.43 = Avg 2.74

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 13:23:58 +10:30
Rusty Russell
9f295eecff tdb: remove lock ops
Now the transaction code uses the standard allrecord lock, that stops
us from trying to grab any per-record locks anyway.  We don't need to
have special noop lock ops for transactions.

This is a nice simplification: if you see brlock, you know it's really
going to grab a lock.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 10:49:22 +10:30
Rusty Russell
a84222bbaf tdb: rename tdb_release_extra_locks() to tdb_release_transaction_locks()
tdb_release_extra_locks() is too general: it carefully skips over the
transaction lock, even though the only caller then drops it.  Change
this, and rename it to show it's clearly transaction-specific.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 11:02:55 +10:30
Rusty Russell
fca1621965 tdb: tdb_allrecord_lock/tdb_allrecord_unlock/tdb_allrecord_upgrade
Centralize locking of all chains of the tdb; rename _tdb_lockall to
tdb_allrecord_lock and _tdb_unlockall to tdb_allrecord_unlock, and
tdb_brlock_upgrade to tdb_allrecord_upgrade.

Then we use this in the transaction code.  Unfortunately, if the transaction
code records that it has grabbed the allrecord lock read-only, write locks
will fail, so we treat this upgradable lock as a write lock, and mark it
as upgradable using the otherwise-unused offset field.

One subtlety: now the transaction code is using the allrecord_lock, the
tdb_release_extra_locks() function drops it for us, so we no longer need
to do it manually in _tdb_transaction_cancel.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 15:42:15 +10:30
Rusty Russell
1ab8776247 tdb: remove num_locks
This was redundant before this patch series: it mirrored num_lockrecs
exactly.  It still does.

Also, skip useless branch when locks == 1: unconditional assignment is
cheaper anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 15:01:07 +10:30
Rusty Russell
e8fa70a321 tdb: use tdb_nest_lock() for transaction lock.
Rather than a boutique lock and a separate nest count, use our
newly-generic nested lock tracking for the transaction lock.

Note that the tdb_have_extra_locks() and tdb_release_extra_locks()
functions have to skip over this lock now it is tracked.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 12:37:34 +10:30
Rusty Russell
db270734d8 tdb: cleanup: tdb_release_extra_locks() helper
Move locking intelligence back into lock.c, rather than open-coding the
lock release in transaction.c.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 10:41:15 +10:30
Rusty Russell
fba42f1fb4 tdb: cleanup: tdb_have_extra_locks() helper
In many places we check whether locks are held: add a helper to do this.

The _tdb_lockall() case has already checked for the allrecord lock, so
the extra work done by tdb_have_extra_locks() is merely redundant.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 12:34:26 +10:30
Rusty Russell
5d9de604d9 tdb: cleanup: tdb_nest_lock/tdb_nest_unlock
Because fcntl locks don't nest, we track them in the tdb->lockrecs array
and only place/release them when the count goes to 1/0.  We only do this
for record locks, so we simply place the list number (or -1 for the free
list) in the structure.

To generalize this:

1) Put the offset rather than list number in struct tdb_lock_type.
2) Rename _tdb_lock() to tdb_nest_lock, make it non-static and move the
   allrecord check out to the callers (except the mark case which doesn't
   care).
3) Rename _tdb_unlock() to tdb_nest_unlock(), make it non-static and
   move the allrecord out to the callers (except mark again).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 12:26:13 +10:30
Rusty Russell
e9114a7585 tdb: cleanup: rename global_lock to allrecord_lock.
The word global is overloaded in tdb.  The global_lock inside struct
tdb_context is used to indicate we hold a lock across all the chains.

Rename it to allrecord_lock.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 12:19:47 +10:30
Rusty Russell
7ab422d6fb tdb: cleanup: rename GLOBAL_LOCK to OPEN_LOCK.
The word global is overloaded in tdb.  The GLOBAL_LOCK offset is used at
open time to serialize initialization (and by the transaction code to block
open).

Rename it to OPEN_LOCK.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 12:18:33 +10:30
Rusty Russell
a6e0ef87d2 tdb: make _tdb_transaction_cancel static.
Now tdb_open() calls tdb_transaction_cancel() instead of
_tdb_transaction_cancel, we can make it static.

Signed-off-by: Rusty Russell<rusty@rustcorp.com.au>
2010-02-24 10:39:59 +10:30
Rusty Russell
452b4a5a6e tdb: cleanup: split brlock and brunlock methods.
This is taken from the CCAN code base: rather than using tdb_brlock for
locking and unlocking, we split it into brlock and brunlock functions.

For extra debugging information, brunlock says what kind of lock it is
unlocking (even though fnctl locks don't need this).  This requires an
extra argument to tdb_transaction_unlock() so we know whether the
lock was upgraded to a write lock or not.

We also use a "flags" argument tdb_brlock:
1) TDB_LOCK_NOWAIT replaces lck_type = F_SETLK (vs F_SETLKW).
2) TDB_LOCK_MARK_ONLY replaces setting TDB_MARK_LOCK bit in ltype.
3) TDB_LOCK_PROBE replaces the "probe" argument.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-17 12:17:19 +10:30
Rusty Russell
6269cdcd15 tdb: give a name to the invalid recovery area constant (0)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-10 16:56:13 +10:30
Stefan Metzmacher
3b62e250c0 tdb: rename 'struct list_struct' into 'struct tdb_record'
metze
2009-10-23 18:27:20 +02:00
Rusty Russell
703004340c lib/tdb: TDB_TRACE support (for developers)
When TDB_TRACE is defined (in tdb_private.h), verbose tracing of tdb operations is enabled.
This can be replayed using "replay_trace" from http://ccan.ozlabs.org/info/tdb.

The majority of this patch comes from moving internal functions to _<funcname> to
avoid double-tracing.  There should be no additional overhead for the normal (!TDB_TRACE)
case.

Note that the verbose traces compress really well with rzip.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-22 00:00:12 +10:30
Rusty Russell
54a51839ea Make tdb transaction lock recursive (samba version)
This patch replaces 6ed27edbcd3ba1893636a8072c8d7a621437daf7 and
1a416ff13ca7786f2e8d24c66addf00883e9cb12, which fixed the bug where traversals
inside transactions would release the transaction lock early.

This solution is more general, and solves the more minor symptom that nested
traversals would also release the transaction lock early.  (It was also suggestd in
Volker's comment in 6ed27ed).

This patch also applies to ctdb, if the traverse.c part is removed (ctdb's tdb
code never received the previous two fixes).

Tested using the testsuite from ccan (adapted to the samba code).  Thanks to
Michael Adam for feedback.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Michael Adam <obnox@samba.org>
2009-07-20 22:17:20 +02:00
Jelmer Vernooij
94855cd692 Move common libraries from root to lib/. 2008-09-17 14:11:12 +02:00