1
0
mirror of https://github.com/samba-team/samba.git synced 2025-01-26 10:04:02 +03:00

2832 Commits

Author SHA1 Message Date
Michael Adam
2b29e30df5 One would expect I could spell my name... (cherry picked from samba commit 0d120be36bfc561e3f679d081993ccc6bea2a401)
Signed-off-by: Stefan Metzmacher <metze@samba.org>

(This used to be ctdb commit efa4a795db7fb2bddaab3969850d1554fc5f4da1)
2009-12-16 08:03:43 +01:00
Michael Adam
6e522eb198 tdb: add script/abi_checks.sh. check for abi changes without gcc magic.
USAGE: abi_checks.sh LIBRARY_NAME header1 [header2 ...]

This creates symbol signature lists using the mksyms and mksigs scripts
and compares them with the checked in lists.

Michael
(cherry picked from samba commit 9636e0d373e75cce7063b710227eaaf53f447c1b)

Signed-off-by: Stefan Metzmacher <metze@samba.org>

(This used to be ctdb commit 724d71dc838750fff91a45359feeb6e71bf0a4c7)
2009-12-16 08:03:43 +01:00
Michael Adam
52b657756a tdb: add script to extract signatures from header files.
This produces output like the output gcc produces when
invoked with the -aux-info switch.

Run like this: cat include/tdb.h | ./script/mksigs.pl

This simple parser is probably too coarse to handle all
possible header files, but it treats tdb.h correctly...

Michael
(cherry picked from samba commit 0760a04ef9f7d2f3d966017295712769d02b8b9f)

Signed-off-by: Stefan Metzmacher <metze@samba.org>

(This used to be ctdb commit 141422d9dc24b15b7b8bc7831adab90367a729f7)
2009-12-16 08:03:43 +01:00
Michael Adam
53336a9cb1 tdb: add scripts to extract library symbols (exports file) from headers
Michael
(cherry picked from samba commit 006fd0c43c7c403b8671dfc46e5ee31e92480e1f)

Signed-off-by: Stefan Metzmacher <metze@samba.org>

(This used to be ctdb commit aed864dceaf6ec1e6e6066a587c708b485901200)
2009-12-16 08:03:43 +01:00
Rusty Russell
eb9d367843 lib/tdb: don't overwrite TDBs with different version numbers.
In future, this may happen, and we don't want to clobber them.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(cherry picked from samba commit 398d0c2929026fccb3409316720a4dcad225ab05)

Signed-off-by: Stefan Metzmacher <metze@samba.org>

(This used to be ctdb commit eebd467961dad6cfb38c2a5d6e4b4dbf86e55e63)
2009-12-16 08:03:43 +01:00
Jeremy Allison
41d4e2dc7c Add define guards around otherwise unused variable. Jeremy. (cherry picked from samba commit 4fc9f9c3f943cdeb27e37f0ee068cdd0da7cb00c)
Signed-off-by: Stefan Metzmacher <metze@samba.org>

(This used to be ctdb commit 6f8614de0f20d4c507aecd744d9c3f6545078127)
2009-12-16 08:03:42 +01:00
Rusty Russell
e1217b7bdb There is one signedness issue in tdb which prevents traverses of TDB records over the 2G offset on systems which support 64 bit file offsets. This fixes that case.
On systems with 32 bit offsets, expansion and fcntl locking on these records
will fail anyway.  SAMBA already does '#define _FILE_OFFSET_BITS 64' in
config.h (on my 32-bit x86 Linux system at least) to get 64 bit file offsets.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(cherry picked from samba commit 252f7da702fd0409f7bfff05ef594911ededa32f)

Signed-off-by: Stefan Metzmacher <metze@samba.org>

(This used to be ctdb commit 2d768f664e6db65b3b7e0c732f33ee2b806892f9)
2009-12-16 08:03:42 +01:00
Ronnie Sahlberg
640c48c844 Revert "cleanup: remove a tunable we no longer use in the eventscripts any more :"
This reverts commit 401f421fa003d9515df15e759b50b56e0c67d69c.

Conflicts:

	include/ctdb_private.h
	server/ctdb_tunables.c

(This used to be ctdb commit b883d19a495a41a22db37f9c2cf6250fee529de0)
2009-12-16 09:51:17 +11:00
Ronnie Sahlberg
fcd16342f6 Merge branch 'trans3'
(This used to be ctdb commit b765e12a5fb87a6121e49b349017b6a961929346)
2009-12-15 21:00:22 +11:00
Ronnie Sahlberg
b3104bd1d0 Author: Rusty Russell <rusty@rustcorp.com.au>
Date:   Tue Dec 15 15:53:30 2009 +1030

    eventscript: hack to avoid overloading valgrind

    Now we fork one child per script, when running under valgrind the
load
    gets quite high.  This is because valgrind does a lot of work after
exit,
    and we don't wait for the children to finish; we start the next one
when
    the child reports status via the pipe.

    This fix is ugly, but simple.

    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 6ed34d5320c39d8a55f2a36ad4c1ab574e0b0796)
2009-12-15 20:56:16 +11:00
Ronnie Sahlberg
842aa60d52 This is a dodgy patch.
I saw once where the master ctdbd logging structure was talloc freed
which caused issues.
So only free the structure if it is NOT the master structure.

This needs to be looked into in more detail.

(This used to be ctdb commit bcf494b81f4277dc75f05faccf0c446bd15f6e2b)
2009-12-15 19:04:52 +11:00
Ronnie Sahlberg
0982299bed Revert "Make fetch_locked more scalable"
This reverts commit 5736e17c139c9a8049e235429aeae0c6c9d0e93d.

(This used to be ctdb commit 3d2d877d877146ca09a28a3a44f4840eb36fd377)
2009-12-15 14:26:28 +11:00
Ronnie Sahlberg
5a7e9900df Merge commit 'obnox/ctdb-wip-trans3' into trans3
(This used to be ctdb commit ac06a0e042e7d024060d6e87a49bda9ccc072c52)
2009-12-15 14:25:55 +11:00
Ronnie Sahlberg
3b53c02e34 add a new test tool that just locks and releases the same record over and over
(This used to be ctdb commit 24767be2eb9aed29704c2a4097bab5466cb6728f)
2009-12-15 12:14:49 +11:00
Ronnie Sahlberg
244bc5cc8f ctdb_fetch requires the number of nodes being specified.
Have it log an error and terminate if thie parameter was omitted

(This used to be ctdb commit 340be0179f55acfff77f8c3c8be958679227bde1)
2009-12-15 11:29:16 +11:00
Ronnie Sahlberg
e2e30df2e9 When setting up the logging, set the event to trigger a read of a log message from a child process as a child of the "log" structure and not the ctdb structure,
or else we can crash if we receive log messages from a child but the log structure has been freed()

(This used to be ctdb commit ea9e39369379939abf6a4076fa2014c10c1a9ad0)
2009-12-15 10:45:18 +11:00
Ronnie Sahlberg
db0d2a1b8f From rusty:
Subject: eventscript: fix spinning at 100% cpu when child exits.

ctdbd was spinning reading 0 from a pipe, as soon as the first
eventscript finishes.

This was caused by the intersection between a78b8ea7168e "Run only one
event for each epoll_wait/select call" and 32cfdc3aec34 "eventscript:
ctdb_fork_with_logging()".  Unavoidable mid-air collision, since both
worked fine and both were developed simultaneously.

When the script exits, we have two pipes open to it: one for any
stdout/stderr for logging (ctdb_log_handler), and one for the result
(ctdb_event_script_handler).  The latter frees everything, including
the log fd and event structure.

We used to get one callback to ctdb_log_handler, which got a harmless
0-length read, then one to ctdb_event_script_handler which cleaned up.
Now we only do one callback per poll, we need the logging function to
clean itself up so we can make process.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 211ea7907e8e96041aa6f7d086551d64d065a8a3)
2009-12-15 10:23:58 +11:00
Ronnie Sahlberg
649ba2631d Rename the tunable EventScriptBanCount to EventScriptTimeoutCount
since we no longer ban nodes when dodgy scripts continue to hang.

We now only mark nodes as unhealthy if monitor events fail or timeout. Never ban.

(This used to be ctdb commit 5c8e56fc7a518e115bceac257867739283cf6a1e)
2009-12-14 15:53:23 +11:00
Ronnie Sahlberg
ed6b5a8c68 cleanup: remove a tunable we no longer use in the eventscripts any more :
EventScriptUnhealthyOnTimeout

(This used to be ctdb commit 401f421fa003d9515df15e759b50b56e0c67d69c)
2009-12-14 15:48:47 +11:00
Rusty Russell
cab8da8dc4 ctdb: don't print OUTPUT: for DISABLED scripts
In other news, did you know ctime() returns a \n-terminated string?

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 1b4e7bb548976b99f122142b040494b6f9911962)
2009-12-14 15:46:49 +11:00
Rusty Russell
784fa9fd8a eventscript: fix monitoring when killed by another script command
Commit c1ba1392fe "eventscript: get rid of ctdb_control_event_script_finished
altogether" was wrong: there is one case where we want to free the script
without transferring their status to last_status.  This happens because we
always kill an running monitor command when we run any other command.

This still isn't quite right (and never was): the callback will be called
with status value 0, which might flip us to HEALTHY if we were unhealthy.
This is conveniently fixed in my next set of patches :)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 0ea0e27d93398df997d3df9d8bf112358af3a4a5)
2009-12-14 15:46:14 +11:00
Ronnie Sahlberg
e76561f544 remove the variable "disable when unhealthy"
there is no rational need for a setting where we permanently mark nodes as disabled everytime an eventscript fails

(This used to be ctdb commit 68a8ee99b128a5ec883600735626bdb3bbc9c503)
2009-12-14 15:40:54 +11:00
Michael Adam
b41d9a2bcc Revert "recovery: add special pull-logic for persistent databases"
This reverts commit 8aef46d2aab3efb322dda51eaa202653cefd5222.

This special recovery logic is wrong now with the transaction rewrite.
The treatment of persistent databases will later be rewritten to use the
database sequence number.

Michael

(This used to be ctdb commit c5a0aef668a63f927d6184612b13ce316eb4a0be)
2009-12-12 00:45:40 +01:00
Volker Lendecke
f6ea3e6bcf Make fetch_locked more scalable
This patch improves the handling of the fetch_lock operation on non-persistent
databases that ctdb clients have to do very frequently.

The normal flow how this goes is the following:

1. Client does a local fetch_lock on the database

2. Client looks if the local node is dmaster.
   If yes, everything is fine
   If no, continue here

3. Client unlocks the local record

4. Client issues a "get me the record" call to ctdbd

5. ctdbd goes out and fetches the dmaster role

6. ctdbd tells the client to retry

7. Client starts over again

The problem is between step 6 and 7: Before the client has had the chance to
retry (i.e. catch the record with a fetch_locked), another node might have come
asking ctdbd to migrate away the record again. This is a real problem, I've
seen >20 loops of this kind in real workloads.

This patch does the following: Whenever ctdb receives a record as result of
step 5, it puts the key on a "holdback list". As long as a key is on this list,
a request to migrate away the dmaster is put on hold. It is the client's duty
to issue the "CTDB_CONTROL_GOTIT" control when it has successfully done step 2
after having asked ctdb to fetch the record. This will release the key from the
"holdback list" and re-issue all dmaster migration requests.

As a safeguard against malicious clients, once a second (default 1000msecs,
tunable "HoldbackCleanupInterval" in milliseconds) ctdbd goes over the list of
held back keys, deletes them and releases all held back migration requests.

(This used to be ctdb commit 5736e17c139c9a8049e235429aeae0c6c9d0e93d)
2009-12-12 00:45:39 +01:00
Volker Lendecke
b664a86bc2 Import "talloc_array_length" from upstream talloc
(This used to be ctdb commit 844aa6300ee4d87561e698001ebc15ac1e455528)
2009-12-12 00:45:39 +01:00
Michael Adam
aea324336c tests: temporarily disable the transaction test tool.
Make it return success for make test.
This is temporarily disabled until the rewrite of the
transaction code (in samba and the daemon) using the global
lock feature has been ported to the ctdb client code.

Michael

(This used to be ctdb commit 78ca29352aa39f4ef4e41096b92d55cb2e0d348a)
2009-12-12 00:45:39 +01:00
Michael Adam
46de365e78 Add a new control CTDB_GET_DB_SEQNUM - fetch a persistent db's sequence number.
Michael

(This used to be ctdb commit a7e3b5fac6b3f5d74473f26eb86c067b35647996)
2009-12-12 00:45:39 +01:00
Michael Adam
8dedde81cd define CTDB_DB_SEQNUM_KEY - used with the new implementation of transactions.
Michael

(This used to be ctdb commit 4b1dbcf0853bdc4832d39a477823ae34f216da52)
2009-12-12 00:45:38 +01:00
Volker Lendecke
9f16f655fa Tiny simplification of ctdb_queue_packet()
(This used to be ctdb commit 1640da1cab7e8b545367824204c82931f3346848)
2009-12-12 00:45:38 +01:00
Volker Lendecke
24d04a3e89 Rename a struct member for clarity
(This used to be ctdb commit 6af5e74a21546d723008d69d6752ebebf898c947)
2009-12-12 00:45:37 +01:00
Michael Adam
faacd5ca79 server: add a new control CTDB_CONTROL_TRANS3_COMMIT
This is a simplified version of the trans2 commit control:
It just rolls out the marshall buffer to all active nodes.

It is the main ctdbd part of the re-implementation of the
persistent transactions. The client code is changed to
take a global lock to start a transactions and store into
the marshal buffer instead of writing to the local tdb
under a local transaction.

The old transaction implementation is going to be
removed in a later commit.

Michael

(This used to be ctdb commit f66428f9d2013080a414404c1ba6117888352fd6)
2009-12-12 00:43:26 +01:00
Ronnie Sahlberg
a8549ef700 From: Volker Lendecke <vl@samba.org>
Date: Wed, 9 Dec 2009 22:45:12 +0100
Subject: [PATCH] Revert an accidential commit

(This used to be ctdb commit af6656f2844d8fd72204a70358c9d589dbe1bd34)
2009-12-10 08:53:55 +11:00
Michael Adam
54b9a49e2e tests: remove the no_trans mode from ctdb_transaction.
Writes without transaction are not possible any more on
persistent databases.

Michael

(This used to be ctdb commit 59f46d7261dfdbdef900bf95dd9eb28ad22a46b2)
2009-12-09 22:04:48 +01:00
Michael Adam
332017925f tests: remove the persistent_unsafe writes test.
This is useless now that persistent write operations without
transaction are forbidden.

Michael

(This used to be ctdb commit b022863d44026c19d5aae54aa485b670bea0540e)
2009-12-09 21:57:00 +01:00
Michael Adam
aa6e42a4ba tests: remove persistent_safe write test.
This is useless now that persistent writes without transactions are forbidden.

Michael

(This used to be ctdb commit 9ac82311d796e1fab31f8de62b8ccc754445093c)
2009-12-09 21:56:59 +01:00
Michael Adam
c32ff2bbb0 test: add test 54_ctdb_transaction_recovery.sh
This is like the 53_ctdb_transaction test, but it additionally
runs a loop with recoveries while the transactions are running.

When called like this, the transaction loops run for 10 minutes:

CTDB_TEST_TIMELIMIT=600 tests/scripts/run_tests tests/simple/54_ctdb_transaction_recovery.sh

The default timelimit is 30 seconds.

Michael

(This used to be ctdb commit 2ff2679e8f3d50ebf735f2c420898a84268bdc95)
2009-12-09 21:56:59 +01:00
Michael Adam
edfc6a8c12 test: get value for --timelimit from environment var CTDB_TEST_TIMELIMIT in transaction test
Michael

(This used to be ctdb commit c13077ca64f6e6569c30ef7fcb044e5711dce1a3)
2009-12-09 21:56:59 +01:00
Michael Adam
c2c9a04cf2 client: lower level of commit retry message WARNING->DEBUG
This can happen frequently when recoveries intercept transactions.

Michael

(This used to be ctdb commit c46adb210e47530488503e20d682d4d182c0fb79)
2009-12-09 21:56:59 +01:00
Michael Adam
97d780bc20 client: lower debug level of transaction-active-retry message to DEBUG
This reduces some noise.

Michael

(This used to be ctdb commit 54d227811753f4a87f1a2c9dc0b1389f5ca2a12f)
2009-12-09 21:56:59 +01:00
Michael Adam
ea65e80223 call: lower the debug message "refusing migration while transction" to lvl INFO
This gets just too noisy on a busy system.
And it is purley informational anyways...

Michael

(This used to be ctdb commit 7f64a00c76203fdf6673c3f862a4bfd17fb848d7)
2009-12-09 21:56:59 +01:00
Volker Lendecke
a0d9bd3c13 Run only one event for each epoll_wait/select call
This might be a bit less efficient, but experience in winbind has shown that
event callbacks can trigger changes in the socket state in very hard to
diagnose ways.

(This used to be ctdb commit a78b8ea7168e5fdb2d62379ad3112008b2748576)
2009-12-10 07:52:16 +11:00
Christian Ambach
47f8c380d2 reduce vacuuming lognoise
syslog.h says:

LOG_NOTICE      5    normal but significant condition
LOG_INFO        6    informational

several vacuuming related logs logged at NOTICE level although I don't see
any real significance, these are just informational messages for me

Signed-off-by: Christian Ambach <christian.ambach@de.ibm.com>

(This used to be ctdb commit 142111983c103e90ccccbe26fd580c4eb28e949f)
2009-12-10 07:33:59 +11:00
Christian Ambach
4269d37ce8 improve time jump logging
add the __location__ macro to the logs to get a better idea
in which loop the problem occured

Signed-off-by: Christian Ambach <christian.ambach@de.ibm.com>

(This used to be ctdb commit dccb549fd6a6e338063699544e52f2a1a6a966b5)
2009-12-10 07:31:04 +11:00
Ronnie Sahlberg
839670253a Merge commit 'rusty/script-report'
(This used to be ctdb commit 6e8b279ed307eccac08386e98510361ba3ab3d36)
2009-12-09 14:26:42 +11:00
Ronnie Sahlberg
50820f9e18 Bond devices can have any name the user configures, so
when checking link status for an interface, first
check if this interface is in fact a bond device
(by the precense of a /proc/net/bonding/IFACE file)
and use that file for checking status.

Othervise assume ib* is an infiniband interface which we donnt know how
to check, or otherwise it is an ethernet interface and ethtool should
hopefully work.

(This used to be ctdb commit 8cc6c5de3d7abb0b72eaa6e769e70963b02d84cb)
2009-12-09 11:33:04 +11:00
Ronnie Sahlberg
3ca3f4c771 make sure to also check that interfaces used for NATGW are ok
and have a link.
if not the node should become unhealthy

(This used to be ctdb commit 03b5bbaae1b53830a4cd20d3079ab8f45ffce923)
2009-12-09 11:13:29 +11:00
Stefan Metzmacher
af170d1a8a events/50.samba: only use wbinfo --ping-dc if available
metze

(This used to be ctdb commit 7b73834ba3ac197cc8a3020c111f9bb2c567e70b)
2009-12-08 07:38:00 +11:00
Rusty Russell
a46c3b4f2a ctdb: scriptstatus can now query non-monitor events
We also no longer return an error before scripts have been run; a special
zero-length data means we have never run the scripts.

"ctdb scriptstatus all" returns all event script results.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 9b90d671581e390e2892d3a68f3ca98d58bef4df)
2009-12-08 01:50:55 +10:30
Rusty Russell
5d99a1a47c eventscript: expost call names and enum
We're going to need this so ctdb can query non-monitor status.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 53bc5ca23ca55a3ac63a440051f16716944a2a51)
2009-12-08 01:47:13 +10:30
Rusty Russell
0dbe76f88f eventscript: lock logging on timeout.
Ronnie suggested this; seems like a very good idea.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 93153bca68926401dc9ae7fd77ed3f17be923344)
2009-12-08 01:32:36 +10:30