IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
In 2dcf76 Rusty added TDB_INCOMPATIBLE_HASH open flag which selects
Jenkins lookup3 hash for new databases.
Expose this flag to python users too.
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Jelmer Vernooij <jelmer@samba.org>
This flag to tdb_open/tdb_open_ex effects creation of a new database:
1) Uses the Jenkins lookup3 hash instead of the old gdbm hash if none is
specified,
2) Places a non-zero field in header->rwlocks, so older versions of TDB will
refuse to open it.
This means that the caller (ie Samba) can set this flag to safely
change the hash function. Versions of TDB from this one on will either
use the correct hash or refuse to open (if a different hash is specified).
Older TDB versions will see the nonzero rwlocks field and refuse to open
it under any conditions.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If the caller to tdb_open_ex() doesn't specify a hash, and tdb_old_hash
doesn't match, try tdb_jenkins_hash.
This was Metze's idea: it makes life simpler, especially with the upcoming
TDB_INCOMPATIBLE_HASH flag.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a better hash than the default: shipping it with tdb makes it easy
for callers to use it as the hash by passing it to tdb_open_ex().
This version taken from CCAN and modified, which took it from
http://www.burtleburtle.net/bob/c/lookup3.c.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The problem was tdb->name is NULL for TDB_INTERNAL databases, and
so it was crashing ...
#0 0xb76944f3 in strlen () from /lib/i686/cmov/libc.so.6
#1 0x0809862b in PyString_FromFormatV (format=0xb72b6a26 "Tdb('%s')", vargs=0xbfc26a94 "")
at ../Objects/stringobject.c:211
#2 0x08098888 in PyString_FromFormat (format=0xb72b6a26 "Tdb('%s')") at ../Objects/stringobject.c:358
#3 0xb72b65f2 in tdb_object_repr (self=0xb759e060) at ./pytdb.c:439
Cc: 597089@bugs.debian.org
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Jelmer Vernooij <jelmer@samba.org>
Note, unlike tdb_open where flags is `int', tdb_{add,remove}_flags want
flags as `unsigned', so instead of "i" I used "I" in PyArg_ParseTuple.
Cc: 597386@bugs.debian.org
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Jelmer Vernooij <jelmer@samba.org>
This is Stefan Metzmacher <metze@samba.org>'s patch with minor changes:
1) Use the TDB_MAGIC constant so both hashes aren't of strings.
2) Check the hash in tdb_check (paranoia, really).
3) Additional check in the (unlikely!) case where both examples hash to 0.
4) Cosmetic changes to var names and complaint message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Commit bc1c82ea13 "Fix tdb_check() to work with read-only tdb databases."
claimed to do this, but tdb_lockall_read() fails on read-only databases.
Also make sure we can still do tdb_check() inside a transaction (weird,
but we previously allowed it so don't break the API).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We can end up with dead areas when we die during transaction commit;
tdb_check() fails on such a (valid) database.
This is particularly noticable now we no longer truncate on recovery;
if the recovery area was at the end of the file we used to remove it
that way.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We saw tdb_lockall() take 71 seconds under heavy load; this is because Linux
(at least) doesn't prevent new small locks being obtained while we're waiting
for a big log.
The workaround is to do divide and conquer using non-blocking chainlocks: if
we get down to a single chain we block. Using a simple test program where
children did "hold lock for 100ms, sleep for 1 second" the time to do
tdb_lockall() dropped signifiantly. There are ln(hashsize) locks taken in
the contended case, but that's slow anyway.
More analysis is given in my blog at http://rusty.ozlabs.org/?p=120
This may also help transactions, though in that case it's the initial
read lock which uses this gradual locking routine; the update-to-write-lock
code is separate and still tries to update in one go.
Even though ABI doesn't change, minor version bumped so behavior change
can be easily detected.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Commit 207a213c/24fed55d purported to fix the problem of signals during
tdb_new_database (which could cause a spurious short write, hence a failure).
However, the code is wrong: newdb+written is not correct.
Fix this by introducing a general tdb_write_all() and using it here and in
the tracing code.
Cc: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We now use -fvisibilty=hidden to hide symbols from outside the tdb
shared library.
This also moved tdb_transaction_recover() into the tdb_private.h
header, as it should never have been a public API. For that reason we
are changing the version number. We're only doing a minor version
increment as it is extremely unlikely that anyone was actually using
tdb_transaction_recover() as its locking requirements were rather
unusual.
Pair-Programmed-With: Rusty Russell <rusty@samba.org>
Upstream subunit makes a ":" after commands optional, so I've fixed any
places where we might trigger commands accidently. I've filed a bug
about this in subunit.
when we use a system version of a library such as talloc, then we
no longer get the automtica dependency propogation of talloc implying
libreplace. That means we don't get the includes for libreplace, which
means things can fail to build.
To fix this this change adds an implied_deps option to
CHECK_BUNDLED_SYSTEM(), which tells the samba_deps module to add an
implied dependency on the listed targets if the system library is
chosen.
distros can set --bundled-libraries=NONE to force use of all system
libraries. If the right version isn't found then configure will fail.
Users may choose which libraries to use from the system, and which to
use bundled libs. The default is to try system libs, and use them if
their version matches the one in the source tree.
tdb transactions were designed to be robust against the machine
powering off, but interestingly were never designed to handle the case
where an administrator kill -9's a process during commit. Because
recovery is only done on tdb_open, processes with the tdb already
mapped will simply use it despite it being corrupt and needing
recovery.
The solution to this is to check for recovery every time we grab a
data lock: we could have gained the lock because a process just died.
This has no measurable cost: here is the time for tdbtorture -s 0 -n 1
-l 10000:
Before:
2.75 2.50 2.81 3.19 2.91 2.53 2.72 2.50 2.78 2.77 = Avg 2.75
After:
2.81 2.57 3.42 2.49 3.02 2.49 2.84 2.48 2.80 2.43 = Avg 2.74
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
To test the case of death of a process during transaction commit, add
a -k (kill random) option to tdbtorture. The easiest way to do this
is to make every worker a child (unless there's only one child), which
is why this patch is bigger than you might expect.
Using -k without -t (always transactions) you expect corruption, though
it doesn't happen every time. With -t, we currently get corruption but
the next patch fixes that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The current recovery code truncates the tdb file on recovery. This is
fine if recovery is only done on first open, but is a really bad idea
as we move to allowing recovery on "live" databases.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now the transaction code uses the standard allrecord lock, that stops
us from trying to grab any per-record locks anyway. We don't need to
have special noop lock ops for transactions.
This is a nice simplification: if you see brlock, you know it's really
going to grab a lock.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
tdb_release_extra_locks() is too general: it carefully skips over the
transaction lock, even though the only caller then drops it. Change
this, and rename it to show it's clearly transaction-specific.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now the transaction allrecord lock is the standard one, and thus is cleaned
in tdb_release_extra_locks(), _tdb_transaction_cancel() doesn't need to
know what type it is.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Centralize locking of all chains of the tdb; rename _tdb_lockall to
tdb_allrecord_lock and _tdb_unlockall to tdb_allrecord_unlock, and
tdb_brlock_upgrade to tdb_allrecord_upgrade.
Then we use this in the transaction code. Unfortunately, if the transaction
code records that it has grabbed the allrecord lock read-only, write locks
will fail, so we treat this upgradable lock as a write lock, and mark it
as upgradable using the otherwise-unused offset field.
One subtlety: now the transaction code is using the allrecord_lock, the
tdb_release_extra_locks() function drops it for us, so we no longer need
to do it manually in _tdb_transaction_cancel.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Records themselves get (read) locked by the traversal code against delete.
Interestingly, this locking isn't done when the allrecord lock has been
taken, though the allrecord lock until recently didn't cover the actual
records (it now goes to end of file).
The write record lock, grabbed by the delete code, is not suppressed
by the allrecord lock. This is now bad: it causes us to punch a hole
in the allrecord lock when we release the write record lock. Make this
consistent: *no* record locks of any kind when the allrecord lock is
taken.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were previously inconsistent with our "global" lock: the
transaction code grabbed it from FREELIST_TOP to end of file, and the
rest of the code grabbed it from FREELIST_TOP to end of the hash
chains. Change it to always grab to end of file for simplicity and
so we can merge the two.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This was redundant before this patch series: it mirrored num_lockrecs
exactly. It still does.
Also, skip useless branch when locks == 1: unconditional assignment is
cheaper anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is pure overhead, but it centralizes the locking. Realloc (esp. as
most implementations are lazy) is fast compared to the fnctl anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Use our newly-generic nested lock tracking for the active lock.
Note that the tdb_have_extra_locks() and tdb_release_extra_locks()
functions have to skip over this lock now it is tracked.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This never nests, so it's overkill, but it centralizes the locking into
lock.c and removes the ugly flag in the transaction code to track whether
we have the lock or not.
Note that we have a temporary hack so this places a real lock, despite
the fact that we are in a transaction.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rather than a boutique lock and a separate nest count, use our
newly-generic nested lock tracking for the transaction lock.
Note that the tdb_have_extra_locks() and tdb_release_extra_locks()
functions have to skip over this lock now it is tracked.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Factor out two loops which find locks; we are going to introduce a couple
more so a helper makes sense.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Move locking intelligence back into lock.c, rather than open-coding the
lock release in transaction.c.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In many places we check whether locks are held: add a helper to do this.
The _tdb_lockall() case has already checked for the allrecord lock, so
the extra work done by tdb_have_extra_locks() is merely redundant.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
tdb_transaction_lock() and tdb_transaction_unlock() do nothing if we
hold the allrecord lock. However, the two locks don't overlap, so
this is wrong.
This simplification makes the transaction lock a straight-forward nested
lock.
There are two callers for these functions:
1) The transaction code, which already makes sure the allrecord_lock
isn't held.
2) The traverse code, which wants to stop transactions whether it has the
allrecord lock or not. There have been deadlocks here before, however
this should not bring them back (I hope!)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Because fcntl locks don't nest, we track them in the tdb->lockrecs array
and only place/release them when the count goes to 1/0. We only do this
for record locks, so we simply place the list number (or -1 for the free
list) in the structure.
To generalize this:
1) Put the offset rather than list number in struct tdb_lock_type.
2) Rename _tdb_lock() to tdb_nest_lock, make it non-static and move the
allrecord check out to the callers (except the mark case which doesn't
care).
3) Rename _tdb_unlock() to tdb_nest_unlock(), make it non-static and
move the allrecord out to the callers (except mark again).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The word global is overloaded in tdb. The global_lock inside struct
tdb_context is used to indicate we hold a lock across all the chains.
Rename it to allrecord_lock.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The word global is overloaded in tdb. The GLOBAL_LOCK offset is used at
open time to serialize initialization (and by the transaction code to block
open).
Rename it to OPEN_LOCK.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now tdb_open() calls tdb_transaction_cancel() instead of
_tdb_transaction_cancel, we can make it static.
Signed-off-by: Rusty Russell<rusty@rustcorp.com.au>
This is taken from the CCAN code base: rather than using tdb_brlock for
locking and unlocking, we split it into brlock and brunlock functions.
For extra debugging information, brunlock says what kind of lock it is
unlocking (even though fnctl locks don't need this). This requires an
extra argument to tdb_transaction_unlock() so we know whether the
lock was upgraded to a write lock or not.
We also use a "flags" argument tdb_brlock:
1) TDB_LOCK_NOWAIT replaces lck_type = F_SETLK (vs F_SETLKW).
2) TDB_LOCK_MARK_ONLY replaces setting TDB_MARK_LOCK bit in ltype.
3) TDB_LOCK_PROBE replaces the "probe" argument.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If a process (or the machine) dies after just after writing the
recovery head (pointing at the end of file), the recovery record will filled
with 0x42. This will not invoke a recovery on open, since rec.magic
!= TDB_RECOVERY_MAGIC.
Unfortunately, the first transaction commit will happily reuse that
area: tdb_recovery_allocate() doesn't check the magic. The recovery
record has length 0x42424242, and it writes that back into the
now-valid-looking transaction header) for the next comer (which
happens to be tdb_wipe_all in my tests).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This should make it easier to keep all release scripts alined as it will reduce
the difference between them to ideally a few variables
Also moves the tdb script in the scripts directory.
There was a bug in tdb where the
tdb_brlock(tdb, GLOBAL_LOCK, F_UNLCK, F_SETLKW, 0, 1);
(ending the transaction-"mutex") was done before the
/* remove the recovery marker */
This means that when a transaction is committed there is a window where another
opener of the file sees the transaction marker while the transaction committer
is still fully functional and working on it. This led to transaction being
rolled back by that second opener of the file while transaction_commit() gave
no error to the caller.
This patch moves the F_UNLCK to after the recovery marker was removed, closing
this window.
We need to keep TDB_ALLOW_NESTING as default behavior,
so that existing code continues to work.
However we may change the default together with a major version
number change in future.
metze
Make the default be that transaction is not allowed and any attempt to create a nested transaction will fail with TDB_ERR_NESTING.
If an application can cope with transaction nesting and the implicit
semantics of tdb_transaction_commit(), it can enable transaction nesting
by using the TDB_ALLOW_NESTING flag.
(cherry picked from ctdb commit 3e49e41c21eb8c53084aa8cc7fd3557bdd8eb7b6)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
While studying tdb, I've noticed a couple of mismatches between readme
and actual code:
- tdb_open_ex changed it's log_fn argument to log_ctx
- there is now no tdb_update(), which it seems was transformed into
non-exported tdb_update_hash()
There were other mismatches, but I don't remember them now, sorry.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The reason I do it is that when using older python-tdb as shipped in
Debian Lenny, python interpreter crashes on this test:
(gdb) bt
#0 0xb7f8c424 in __kernel_vsyscall ()
#1 0xb7df5640 in raise () from /lib/i686/cmov/libc.so.6
#2 0xb7df7018 in abort () from /lib/i686/cmov/libc.so.6
#3 0xb7e3234d in __libc_message () from /lib/i686/cmov/libc.so.6
#4 0xb7e38624 in malloc_printerr () from /lib/i686/cmov/libc.so.6
#5 0xb7e3a826 in free () from /lib/i686/cmov/libc.so.6
#6 0xb7b39c84 in tdb_close () from /usr/lib/libtdb.so.1
#7 0xb7b43e14 in ?? () from /var/lib/python-support/python2.5/_tdb.so
#8 0x0a038d08 in ?? ()
#9 0x00000000 in ?? ()
master's pytdb does not (we have a check for self->closed in obj_close()),
but still...
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
So that erroneous double tdb_close() calls do not try to close() same
fd again. This is like SAFE_FREE() but for fd.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We no longer use swig for pytdb, so there is no need for swig make
rules. Also pytdb.c header should be updated.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
ctdb wants a quick way to detect corrupt tdbs; particularly, tdbs with
loops in their hash chains. tdb_check() provides this.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It was a regrettable hack which I used to reduce line count in tdb; in fact it caused confusion as can be seen in this patch.
In particular, ecode now needs to be set before TDB_LOG anyway, and having it exposed in
the header is useless (the struct tdb_context isn't defined, so it's doubly useless).
Also, we should never set errno, as io.c was doing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When TDB_TRACE is defined (in tdb_private.h), verbose tracing of tdb operations is enabled.
This can be replayed using "replay_trace" from http://ccan.ozlabs.org/info/tdb.
The majority of this patch comes from moving internal functions to _<funcname> to
avoid double-tracing. There should be no additional overhead for the normal (!TDB_TRACE)
case.
Note that the verbose traces compress really well with rzip.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There was a race condition that caused the torture.tdb to be left in a
state that needed recovery. The torture code thought that any message
from the tdb code was an error, so the "recovered" message, which is a
TDB_DEBUG_TRACE message, marked the run as being an error when it
isn't.
We previously only allowed a commit to happen after a prepare
commit. It is in fact safe to allow reads between a prepare and a
commit, and the s4 replication code can make use of that, so allow it.
USAGE: abi_checks.sh LIBRARY_NAME header1 [header2 ...]
This creates symbol signature lists using the mksyms and mksigs scripts
and compares them with the checked in lists.
Michael
This produces output like the output gcc produces when
invoked with the -aux-info switch.
Run like this: cat include/tdb.h | ./script/mksigs.pl
This simple parser is probably too coarse to handle all
possible header files, but it treats tdb.h correctly...
Michael
over the 2G offset on systems which support 64 bit file offsets. This fixes
that case.
On systems with 32 bit offsets, expansion and fcntl locking on these records
will fail anyway. SAMBA already does '#define _FILE_OFFSET_BITS 64' in
config.h (on my 32-bit x86 Linux system at least) to get 64 bit file offsets.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The flags are user-visible, via tdb_get_flags/add_flags/remove_flags.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
This version just wraps the reopen code, so we still re-grab the lock and do
the normal sanity checks.
The reason we do this at all is to avoid global fd limits, see:
http://forums.fedoraforum.org/showthread.php?t=210393
Note also that this whole reopen concept is fundamentally racy: if the parent
goes away before the child calls tdb_reopen_all, the database can be left
without an active lock and another TDB_CLEAR_IF_FIRST opener will clear it.
A fork_with_tdbs() wrapper could use a pipe to solve this, but it's hardly
elegant (what if there are other independent things which have similar needs?).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
This reverts commit e17df483fb.
tdb_reopen_all also restores the active lock, required for TDB_CLEAR_IF_FIRST.
Signed-off-by: Stefan Metzmacher <metze@samba.org>
54a51839ea "Make tdb transaction lock
recursive (samba version)" was broken: I "cleaned it up" and prevented
it from ever unlocking.
To see the problem:
$ bin/tdbtorture -s 1248142523
tdb_brlock failed (fd=3) at offset 8 rw_type=1 lck_type=14 len=1
tdb_transaction_lock: failed to get transaction lock
tdb_transaction_start failed: Resource deadlock avoided
My testcase relied on the *count* being correct, which it was. Fixing that
now.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Michael Adam <obnox@samba.org>
This patch replaces 6ed27edbcd and
1a416ff13c, which fixed the bug where traversals
inside transactions would release the transaction lock early.
This solution is more general, and solves the more minor symptom that nested
traversals would also release the transaction lock early. (It was also suggestd in
Volker's comment in 6ed27ed).
This patch also applies to ctdb, if the traverse.c part is removed (ctdb's tdb
code never received the previous two fixes).
Tested using the testsuite from ccan (adapted to the samba code). Thanks to
Michael Adam for feedback.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Michael Adam <obnox@samba.org>
This is a first attempt at exporting symbols only for public functions
We also provide a rudimentary ABI checker that tries to check that
function signatures are not changed by mistake.
Given our use of macros this is not an API checker.
It's all based on tdb.h contents and the gcc -aux-info option
This greatly reduces the fragmentation of databases where records
tend to grow slowly by a small amount each time. The case where this
is most seen is the ldb index records. Adding this overallocation
reduced the size of the resulting database by more than 20x when
running a test that adds 10k users.
The idea behind this is to recover from badly fragmented free
lists. Choosing the point where the file expands is fairly arbitrary,
but seems to work well.
During a transaction commit tdb normally uses fsync/msync calls to
make it crash safe. This can be disabled using the TDB_NOSYNC flag,
but it wasn't disabling all the code paths that caused a fsync/msync.
Using tdb_transaction_prepare_commit() gives us 2-phase commits. This
allows us to safely commit across multiple tdb databases at once, with
reasonable transaction semantics
Signed-off-by: tridge@samba.org
The tdb_repack() function repacks a TDB so that it has a single
freelist entry. The file doesn't shrink, but it does remove all
freelist fragmentation. This code originated in the CTDB vacuuming
code, but will now be used in ldb to cope with fragmentation from
re-indexing
tdbbackup was originally written before we had transactions, and it
attempted to use its own fsync() calls to make it safe. Now that we
have transactions we can do it in a much safer (and faster!) fashion