mirror of https://github.com/samba-team/samba.git synced 2025-01-25 06:04:04 +03:00

81314 Commits

Author SHA1 Message Date
Rusty Russell
1fe797aada ntdb: put it back into the build.
This doesn't do anything with it yet, just wires it back into the build.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:07 +02:00
Rusty Russell
13ac664a6d libcli: use tdb directly, not tdb_compat.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:07 +02:00
Rusty Russell
316e5e376c lib/tdb_wrap: use tdb directly, not tdb_compat.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:07 +02:00
Rusty Russell
df4a6e8228 ldb: use tdb directly, not tdb_compat.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:07 +02:00
Rusty Russell
6dc02e832a lib/dbwrap: depend directly on tdb, not tdb_compat.
Simple change, as we get rid of tdb_compat in favour of either ntdb directly
or dbwrap.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:07 +02:00
Rusty Russell
2fc3265873 lib/util_tdb: depend directly on tdb, not tdb_compat.
Simple change, as we get rid of tdb_compat in favour of tdb directly.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:07 +02:00
Rusty Russell
5ff92d8f7d ntdb: update documentation.
Update the design.lyx file with the latest status and the change in hashing.
Also, refresh and add examples to the TDB_porting.txt file.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:07 +02:00
Rusty Russell
b888bc4316 ntdb: optimize ntdb_fetch.
We access the key on lookup, then access the data in the caller.  It
makes more sense to access both at once.  We also put in a likely()
for the case where the hash is not chained.
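
As a rough illustration of the likely() hint mentioned above, here is a minimal sketch. It is not the ntdb code: `sketch_likely`, `struct sketch_bucket` and `sketch_find` are hypothetical names, and the bucket layout is an assumption made only for the example. The macro relies on the GCC/Clang builtin `__builtin_expect` and falls back to a plain expression elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branch hint: tell the compiler the condition is usually true. */
#if defined(__GNUC__)
#define sketch_likely(x) __builtin_expect(!!(x), 1)
#else
#define sketch_likely(x) (x)
#endif

/* Hypothetical bucket: either one entry, or a chain of several. */
struct sketch_bucket {
	int chained;           /* nonzero when `chain` is in use */
	uint64_t single;       /* the only entry when not chained */
	const uint64_t *chain; /* entries when chained */
	size_t chain_len;
};

/* Return 1 if `want` is in the bucket, 0 otherwise. */
static int sketch_find(const struct sketch_bucket *b, uint64_t want)
{
	size_t i;

	assert(b != NULL);
	/* Common case first: the bucket holds a single entry. */
	if (sketch_likely(!b->chained))
		return b->single == want;

	/* Rare case: walk the chain. */
	for (i = 0; i < b->chain_len; i++)
		if (b->chain[i] == want)
			return 1;
	return 0;
}
```

The hint changes only code layout, not results, so it pays off only when the unchained case really dominates.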

Before:
Adding 1000 records: 3644-3724(3675) ns (129656 bytes)
Finding 1000 records: 1596-1696(1622) ns (129656 bytes)
Missing 1000 records: 1409-1525(1452) ns (129656 bytes)
Traversing 1000 records: 1636-1747(1668) ns (129656 bytes)
Deleting 1000 records: 3138-3223(3175) ns (129656 bytes)
Re-adding 1000 records: 3278-3414(3329) ns (129656 bytes)
Appending 1000 records: 5396-5529(5426) ns (253312 bytes)
Churning 1000 records: 9451-10095(9584) ns (253312 bytes)
smbtorture results (--entries=1000)
ntdb speed 183881-191112(188223) ops/sec

After:
Adding 1000 records: 3590-3701(3640) ns (129656 bytes)
Finding 1000 records: 1539-1605(1566) ns (129656 bytes)
Missing 1000 records: 1398-1440(1413) ns (129656 bytes)
Traversing 1000 records: 1629-2015(1710) ns (129656 bytes)
Deleting 1000 records: 3118-3236(3163) ns (129656 bytes)
Re-adding 1000 records: 3235-3355(3275) ns (129656 bytes)
Appending 1000 records: 5335-5444(5385) ns (253312 bytes)
Churning 1000 records: 9350-9955(9494) ns (253312 bytes)
smbtorture results (--entries=1000)
ntdb speed 180559-199981(195106) ops/sec
2012-06-19 05:38:07 +02:00
Rusty Russell
8fdd20b22f ntdb: add -h arg to ntdbrestore
Since our default hashsize is 8192 not 131, we look fat when we convert
near-empty TDBs.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:07 +02:00
Rusty Russell
a941b19e5d ntdb: reduce default hashsize on ntdbtorture.
Just like tdbtorture, having a hashsize of 2 stresses us much more!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:07 +02:00
Rusty Russell
87f871aae3 ntdb: add NTDB_ATTRIBUTE_HASHSIZE
Since we've given up on expansion, let them frob the hashsize again.
We have attributes, so we should use them for optional stuff like
this.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:07 +02:00
Rusty Russell
dd42962878 ntdb: remove hash table trees.
TDB2 started with a top-level hash of 1024 entries, divided into 128
groups of 8 buckets.  When a bucket filled, the 8 bucket group
expanded into pointers into 8 new 64-entry hash tables.  When these
filled, they expanded in turn, etc.

It's a nice idea to automatically expand the hash tables, but it
doesn't pay off.  Remove it for NTDB.

1) It only beats TDB performance when the database is huge and the
   TDB hashsize is small.  We are about 20% slower on medium-size
   databases (1000 to 10000 records), worse on really small ones.
2) Since we're 64 bits, our hash tables are already twice as expensive
   as TDB.
3) Since our hash function is good, it means that all groups tend to
   fill at the same time, meaning the hash enlarges by a factor of 128
   all at once, leading to a very large database at that point.
4) Our efficiency would improve if we enlarged the top level, but
   that makes our minimum db size even worse: it's already over 8k,
   and jumps to 1M after about 1000 entries!
5) Making the sub group size larger gives a shallower tree, which
   performs better, but makes the "hash explosion" problem worse.
6) The code is complicated, having to handle delete and reshuffling
   groups of hash buckets, and expansion of buckets.
7) We have to handle the case where all the records somehow end up with
   the same hash value, which requires special code to chain records for
   that case.

On the other hand, it would be nice if we didn't degrade as badly as
TDB does when the hash chains get long.

This patch removes the hash-growing code, but instead of chaining like
TDB does when a bucket fills, we point the bucket to an array of
record pointers.  Since each on-disk NTDB pointer contains some hash
bits from the record (we steal the upper 8 bits of the offset), 99.5%
of the time we don't need to load the record to determine if it
matches.  This makes an array of offsets much more cache-friendly than
a linked list.
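
The idea of stealing the upper 8 bits of the offset can be sketched as follows. This is only an illustration under stated assumptions: all `sketch_` names are hypothetical, and the choice of the top 8 bits of a 64-bit hash as the stored bits is an assumption, not necessarily what NTDB does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: low 56 bits are the file offset, top 8 are hash bits. */
#define SKETCH_OFF_BITS 56
#define SKETCH_OFF_MASK ((UINT64_C(1) << SKETCH_OFF_BITS) - 1)

/* Pack an offset together with 8 bits of the record's hash. */
static uint64_t sketch_encode(uint64_t off, uint64_t hash)
{
	assert(off <= SKETCH_OFF_MASK);
	return off | ((hash >> 56) << SKETCH_OFF_BITS);
}

/* Recover the plain file offset from a packed entry. */
static uint64_t sketch_offset(uint64_t entry)
{
	return entry & SKETCH_OFF_MASK;
}

/* Cheap filter: false means the record cannot match, so skip loading it. */
static bool sketch_may_match(uint64_t entry, uint64_t hash)
{
	return (entry >> SKETCH_OFF_BITS) == (hash >> 56);
}

/* Count the entries in an array that would actually need a record read.
 * A zero entry is treated as an empty slot. */
static size_t sketch_count_candidates(const uint64_t *entries, size_t n,
				      uint64_t hash)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (entries[i] != 0 && sketch_may_match(entries[i], hash))
			count++;
	return count;
}
```

With 8 stored bits and a well-distributed hash, a non-matching entry passes this filter about one time in 256, so scanning the array rarely touches the records themselves.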

Here are the times (in ns) for tdb_store of N records, tdb_store of N
records the second time, and a fetch of all N records.  I've also
included the final database size and the smbtorture local.[n]tdb_speed
results.

Benchmark details:
1) Compiled with -O2.
2) assert() was disabled in TDB2 and NTDB.
3) The "optimize fetch" patch was applied to NTDB.

10 runs, using tmpfs (otherwise massive swapping as db hits ~30M,
despite plenty of RAM).

                      Insert  Re-ins  Fetch     Size    dbspeed
                      (nsec)  (nsec) (nsec)     (Kb)  (ops/sec)
TDB (10000 hashsize):
    100 records:        3882    3320   1609       53     203204
    1000 records:       3651    3281   1571      115     218021
    10000 records:      3404    3326   1595      880     202874
    100000 records:     4317    3825   2097     8262     126811
    1000000 records:   11568   11578   9320    77005      25046

TDB2 (1024 hashsize, expandable):
    100 records:        3867    3329   1699       17     187100
    1000 records:       4040    3249   1639      154     186255
    10000 records:      4143    3300   1695     1226     185110
    100000 records:     4481    3425   1800    17848     163483
    1000000 records:    4055    3534   1878   106386     160774

NTDB (8192 hashsize):
    100 records:        4259    3376   1692       82     190852
    1000 records:       3640    3275   1566      130     195106
    10000 records:      4337    3438   1614      773     188362
    100000 records:     4750    5165   1746     9001     169197
    1000000 records:    4897    5180   2341    83838     121901

Analysis:
	1) TDB wins on small databases, beating TDB2 by ~15%, NTDB by ~10%.
	2) TDB starts to lose when hash chains get 10 long (fetch 10% slower
	   than TDB2/NTDB).
	3) TDB does horribly when hash chains get 100 long (fetch 4x slower
	   than NTDB, 5x slower than TDB2, insert about 2-3x slower).
	4) TDB2 databases are 40% larger than TDB1.  NTDB is about 15% larger
	   than TDB1.
2012-06-19 05:38:07 +02:00
Rusty Russell
f986554b1e ntdb: special accessor functions for read/write of an offset.
We also split off the NTDB_CONVERT case (where the ntdb is of a
different endian) into its own io function.

NTDB speed:
Adding 10000 records: 3894-9951(8553) ns (815528 bytes)
Finding 10000 records: 1644-4294(3580) ns (815528 bytes)
Missing 10000 records: 1497-4018(3303) ns (815528 bytes)
Traversing 10000 records: 1585-4225(3505) ns (815528 bytes)
Deleting 10000 records: 3088-8154(6927) ns (815528 bytes)
Re-adding 10000 records: 3192-8308(7089) ns (815528 bytes)
Appending 10000 records: 5187-13307(11365) ns (1274312 bytes)
Churning 10000 records: 6772-17567(15078) ns (1274312 bytes)
NTDB speed in transaction:
Adding 10000 records: 1602-2404(2214) ns (815528 bytes)
Finding 10000 records: 456-871(778) ns (815528 bytes)
Missing 10000 records: 393-522(503) ns (815528 bytes)
Traversing 10000 records: 729-1015(945) ns (815528 bytes)
Deleting 10000 records: 1065-1476(1374) ns (815528 bytes)
Re-adding 10000 records: 1397-1930(1819) ns (815528 bytes)
Appending 10000 records: 2927-3351(3184) ns (1274312 bytes)
Churning 10000 records: 3921-4697(4378) ns (1274312 bytes)
smbtorture results:
ntdb speed 86581-191518(175666) ops/sec
2012-06-19 05:38:07 +02:00
Rusty Russell
9133a98c44 ntdb: inline oob check
The simple "is it in range" check can be inline; complex cases can be
handed through to the normal or transaction handler.
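
A minimal sketch of that split, under assumptions: `struct sketch_db`, `sketch_oob` and the counting slow handler are hypothetical names invented for this example, not the ntdb internals. The fast range check is inline; anything it cannot prove in range goes to a handler reached through a function pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical database handle with a pluggable slow-path handler. */
struct sketch_db {
	uint64_t map_size; /* bytes currently known to be valid */
	int (*oob_slow)(struct sketch_db *db, uint64_t off, uint64_t len);
};

/* Example slow handler: counts calls and reports out-of-bounds. */
static int sketch_slow_calls;

static int sketch_oob_slow_fail(struct sketch_db *db, uint64_t off,
				uint64_t len)
{
	(void)db;
	(void)off;
	(void)len;
	sketch_slow_calls++;
	return -1;
}

/* Return 0 if [off, off+len) is in range, else whatever the handler says. */
static inline int sketch_oob(struct sketch_db *db, uint64_t off, uint64_t len)
{
	assert(db != NULL && db->oob_slow != NULL);
	/* Fast path: written so that off + len can never overflow. */
	if (len <= db->map_size && off <= db->map_size - len)
		return 0;
	/* Complex cases (file grew, inside a transaction, ...) go here. */
	return db->oob_slow(db, off, len);
}
```

A real slow path would typically re-check the file size before failing; the handler here only reports failure so that the control flow is easy to follow.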

NTDB speed:
Adding 10000 records: 4111-9983(9149) ns (815528 bytes)
Finding 10000 records: 1667-4464(3810) ns (815528 bytes)
Missing 10000 records: 1511-3992(3546) ns (815528 bytes)
Traversing 10000 records: 1698-4254(3724) ns (815528 bytes)
Deleting 10000 records: 3608-7998(7358) ns (815528 bytes)
Re-adding 10000 records: 3259-8504(7805) ns (815528 bytes)
Appending 10000 records: 5393-13579(12356) ns (1274312 bytes)
Churning 10000 records: 6966-17813(16136) ns (1274312 bytes)
NTDB speed in transaction:
Adding 10000 records: 916-2230(2004) ns (815528 bytes)
Finding 10000 records: 330-866(770) ns (815528 bytes)
Missing 10000 records: 196-520(471) ns (815528 bytes)
Traversing 10000 records: 356-879(800) ns (815528 bytes)
Deleting 10000 records: 505-1267(1108) ns (815528 bytes)
Re-adding 10000 records: 658-1681(1477) ns (815528 bytes)
Appending 10000 records: 1088-2827(2498) ns (1274312 bytes)
Churning 10000 records: 1636-4267(3785) ns (1274312 bytes)
smbtorture results:
ntdb speed 85588-189430(157110) ops/sec
2012-06-19 05:38:07 +02:00
Rusty Russell
d938c0b591 ntdb: allocator attribute.
This is designed to allow us to make ntdb_context (and NTDB_DATA returned
from ntdb_fetch) a talloc pointer.  But it can also be used for any other
alternate allocator.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:07 +02:00
Rusty Russell
6d5a3e1602 ntdb: still prepare recovery area with NTDB_NOSYNC.
NTDB_NOSYNC now just prevents the fsync/msync calls, which speeds
testing while still providing full coverage.  It also provides safety
against processes dying during transaction commit (though obviously,
not against the machine dying).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:06 +02:00
Rusty Russell
89b0d5ac6c ntdb: simply disallow NULL names.
TDB allows this for internal databases, but it's a bad idea, since the
name is useful for logging.

They're a hassle to deal with, and we'd just end up putting "unnamed"
in there, so let the user deal with it.  If they don't, they get an
informative core dump.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:06 +02:00
Rusty Russell
7fae6c44e2 ntdb: reduce transaction pagesize from 64k to 16k.
The performance numbers for transaction pagesize are indeterminate:
a larger pagesize means a smaller transaction array, and a better
chance of having a contiguous record (more efficient for
ntdb_parse_record and some internal operations inside a transaction).

On the other hand, large pagesize means more I/O even if we change a
few bytes.

But it also controls the multiple by which we will enlarge the file,
and hence the minimum db size.  It's 4k for tdb1, but 16k seems
reasonable in these modern times.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:06 +02:00
Rusty Russell
35381cad1f ntdb: remove last block transaction logic.
Now our database is always a multiple of NTDB_PGSIZE, we can remove the
special handling for the last block.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:06 +02:00
Rusty Russell
db2508840d ntdb: create initial database to be multiple of NTDB_PGSIZE.
As copied from tdb1, there is logic in the transaction code to handle
a non-PGSIZE multiple db, but in fact this only happens for a
completely unused database: as soon as we add anything to it, it is
expanded to a NTDB_PGSIZE multiple.

If we create the database with a free record which pads it out to
NTDB_PGSIZE, we can remove this last-page-is-different logic.

Of course, the fake ntdbs we create in our tests now also need to be
multiples of NTDB_PGSIZE, so we change some numbers there too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:06 +02:00
Rusty Russell
9396757676 ntdb: make sure file is always a multiple of PAGESIZE (now NTDB_PGSIZE)
ntdb uses tdb's transaction code, and it has an undocumented but implicit
assumption: that the transaction recovery area is always aligned to the
transaction pagesize.  This means that no block will overlap the recovery
area.

This is maintained by rounding the size of the database up, so do the same
for ntdb.
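
The rounding described above can be sketched as below. This is an illustration, not the ntdb code: `sketch_round_up` is a hypothetical helper, and the page size is passed as a parameter rather than taken from NTDB_PGSIZE.

```c
#include <assert.h>
#include <stdint.h>

/* Round `len` up to the next multiple of `pgsize` (which must be nonzero). */
static uint64_t sketch_round_up(uint64_t len, uint64_t pgsize)
{
	assert(pgsize != 0);
	assert(len <= UINT64_MAX - (pgsize - 1)); /* no overflow below */
	return (len + pgsize - 1) / pgsize * pgsize;
}
```

If the file size is always rounded this way and the recovery area starts at such a size, no transaction block can straddle the start of the recovery area.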

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:06 +02:00
Rusty Russell
dd4eed4759 ntdb: fix recovery data write.
We were missing the last few bytes.  Found by 100 runs of ntdbtorture
-t -k.

The transaction test code didn't catch this, because usually those
last few bytes are irrelevant to the actual contents of the database.
We fix the test.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:06 +02:00
Rusty Russell
40cf08823d ntdb: enhance external-helper test code.
Our external test helper is a bit primitive when it comes to doing STORE or
FETCH commands: let us specify the data we expect, instead of assuming it's
the same as the key.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:06 +02:00
Rusty Russell
3bccb610c1 ntdb: use NTDB_LOG_WARNING level for failed open() without O_CREAT.
This is a fairly common pattern in Samba, and if we log an error on
every open it spams the logs.  On the other hand, other errors are
potentially more serious, so we still use NTDB_LOG_ERROR on them.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:06 +02:00
Rusty Russell
c7273629a2 ccan: remove bogus debug print.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:06 +02:00
Rusty Russell
8a7c535db1 ntdb: make fork test more thorough.
We document that the child of a fork() can do a brunlock() if the parent
does a brlock: we should not log an error when they do this.

Also, test the case where we fork() and return inside a parse function
(which is allowed).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:06 +02:00
Rusty Russell
d48f6f884b ntdb: print \n at end of log messages in tests.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:06 +02:00
Rusty Russell
5027f9cd02 ntdb: reduce race between creating file and getting open lock.
In tdb, we grab the open lock immediately after we open the file.  In
ntdb, we usually did some work first.  tdbtorture managed to get in
before the creator grabbed the lock:

	testing with 3 processes, 5000 loops, seed=1338246020
	ntdb:torture.ntdb:IO Error:ntdb_open: torture.ntdb is not a ntdb file
	29023:torture.ntdb:db open failed

At cost of a little duplicated code, we can reduce the race.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:06 +02:00
Rusty Russell
1765c0f9ba ntdb: catch any valgrind errors in test
Make --valgrind and --valgrind-log options work!

Amitay figured this out!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:06 +02:00
Rusty Russell
fc9b8ee790 ntdb: catch any valgrind errors in test
We need --error-exitcode=, otherwise valgrind errors don't cause the
test to fail.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:06 +02:00
Rusty Russell
f5e9ed1ea9 ntdb: remove ntdb_error()
It was a hack to make compatibility easier.  Since we're not doing that,
it can go away: all callers must use the return value now.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:06 +02:00
Rusty Russell
16cc345d4f TDB2: Goodbye TDB2, Hello NTDB.
This renames everything from tdb2 to ntdb: importantly, we no longer
use the tdb_ namespace, so you can link against both ntdb and tdb if
you want to.

This also enables building of standalone ntdb by the autobuild script.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:06 +02:00
Kirill Smelkov
76758b9767 tdb2: Fix typo in TDB1_porting.txt
Judging by the code, it's tdb1 where you needed to free the old key's
dptr manually.

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:05 +02:00
Rusty Russell
c3dcdf08f3 TDB2: more internal cleanups after TDB1 compatibility removal.
This eliminates the separate tdb2 substructure, and makes some
tdb1-required functions static.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:05 +02:00
Rusty Russell
cab6e11678 TDB2: remove TDB1 compatibility.
This rips out all the TDB1 compatibility from tdb2.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:05 +02:00
Rusty Russell
6244f668a3 TDB2: make SAMBA use tdb1 again for the moment.
Otherwise the following surgery will break the SAMBA build and testsuite.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:05 +02:00
Rusty Russell
5bad913938 ccan: check for err.h ourselves
Heimdal does this, but that doesn't help the autoconf build or the standalone
libntdb build.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-19 05:38:05 +02:00
Jelmer Vernooij
85b8439d4a WHATSNEW: Fix typo.
"dcerpc endpoint services" -> "dcerpc endpoint servers"

Autobuild-User(master): Jelmer Vernooij <jelmer@samba.org>
Autobuild-Date(master): Tue Jun 19 04:40:12 CEST 2012 on sn-devel-104
2012-06-19 04:40:12 +02:00
Jelmer Vernooij
bf5934ca1b tdb/wscript: Remove unnecessary semicolons.
2012-06-19 02:43:23 +02:00
Stefan Metzmacher
59daf91f39 wafsamba/irixcc: add '-c99' option to cc
Let's see if this fixes the build on IRIX.

metze

Autobuild-User(master): Stefan Metzmacher <metze@samba.org>
Autobuild-Date(master): Tue Jun 19 02:42:21 CEST 2012 on sn-devel-104
2012-06-19 02:42:20 +02:00
Björn Jacke
26e868bfed Revert "s3: temporary hack to make the waf build work withouth autotools being required"
This reverts commit f1becfa27b6b4e35541e6df0cafdec0ad47d2e3f. The hack was
actually only required due to a configuration issue in our buildfarm scripts.

Autobuild-User(master): Björn Jacke <bj@sernet.de>
Autobuild-Date(master): Mon Jun 18 20:07:08 CEST 2012 on sn-devel-104
2012-06-18 20:07:08 +02:00
Stefan Metzmacher
63c2784076 selftest/flapping: samba4.nss.test is also flakey for s3member
[1426/1518 in 1h24m58s] samba4.nss.test using winbind(s3member)
UNEXPECTED(failure): samba4.nss.test using winbind(s3member).run nsstest(s3member)
REASON: _StringException: _StringException: ERROR setpwent: NSS_STATUS=-1  1 (nss_errno=0)
ERROR getpwent: NSS_STATUS=-1  1 (nss_errno=0)
ERROR endpwent: NSS_STATUS=-1  1 (nss_errno=0)
ERROR setgrent: NSS_STATUS=-1  1 (nss_errno=0)
ERROR getgrent: NSS_STATUS=-1  1 (nss_errno=0)
ERROR endgrent: NSS_STATUS=-1  1 (nss_errno=0)
ERROR Non existent user gave error -1
ERROR Non existent uid gave error -1
ERROR Non existent group gave error -1
ERROR Non existent gid gave error -1
total_errors=10

metze

Autobuild-User(master): Stefan Metzmacher <metze@samba.org>
Autobuild-Date(master): Mon Jun 18 17:59:25 CEST 2012 on sn-devel-104
2012-06-18 17:59:24 +02:00
Stefan Metzmacher
333cee7484 s3:autoconf: add -Iautoconf -Iautoconf/source3 at configure stage
Some configure tests require this.

metze
2012-06-18 15:26:45 +02:00
Stefan Metzmacher
a146f0708e s3:Makefile.in: remove pidl generated files with 'make realdistclean'
metze
2012-06-18 15:26:44 +02:00
Stefan Metzmacher
9cbea1f6ed s3:Makefile.in: fix 'make realdistclean' after moving generated files to autoconf/
metze
2012-06-18 15:26:44 +02:00
Stefan Metzmacher
071dfb42f2 s3:Makefile.in: fix 'make clean' after moving generated files to autoconf/
metze
2012-06-18 15:26:43 +02:00
Stefan Metzmacher
9522e853c2 s3:autogen.sh: fix autoconf/lib/param/param_proto.h location
metze
2012-06-18 15:26:43 +02:00
Stefan Metzmacher
da76cda93f lib/param: add missing prototype of lpcfg_parm_long()
metze
2012-06-18 15:26:42 +02:00
Michael Adam
d4912edea6 s3:autoconf-build: build the idmap backends tdb2, rid, and hash by default (shared)
Autobuild-User(master): Michael Adam <obnox@samba.org>
Autobuild-Date(master): Mon Jun 18 13:38:50 CEST 2012 on sn-devel-104
2012-06-18 13:38:50 +02:00
Michael Adam
f5b40b1bdd s3:waf-build: build the idmap backends tdb2, rid, and hash by default (shared)
2012-06-18 11:44:50 +02:00