mirror of https://github.com/samba-team/samba.git synced 2025-01-07 17:18:11 +03:00
Commit Graph

3426 Commits

Author SHA1 Message Date
Ronnie Sahlberg
c95f4258d8 Add a new event "ipreallocated"
This is called every time a reallocation is performed.

    While STARTRECOVERY/RECOVERED events are only called when
    we do ipreallocation as part of a full database/cluster recovery,
    this new event can be used to trigger on when we just do a light
    failover due to a node becoming unhealthy.

    I.e. situations where we do a failover but we do not perform a full
    cluster recovery.

    Use this to trigger for natgw so we select a new natgw master node
    when failover happens and not just when cluster rebuilds happen.

(This used to be ctdb commit 7f4c591388adae20e98984001385cba26598ec67)
2010-08-30 18:09:30 +10:00
Martin Schwenke
46b9110f88 Test suite: Make NFS tickle test more flexible.
Use onnode any where possible rather than a fixed node.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 51561720d2b4db5b307da3d410661075e2a6c3ca)
2010-08-27 11:43:50 +10:00
Martin Schwenke
9878f8cbc2 Test suite: Fix NFS tickle test.
We now kill ctdbd on the test node instead of disabling it.  This
ensures that the only tickles we see will come from the takeover node.

We also sleep for TickleUpdateInterval before asking ctdb about the
tickles.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 48cd8325c070f6942aa13a25269021e4c8ed188f)
2010-08-27 11:40:44 +10:00
Martin Schwenke
68717f689d Test suite: Tweak NFS tickle test.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c32ffd203e42a39010ce2d6e98253e8e48de515a)
2010-08-26 17:56:50 +10:00
Martin Schwenke
d7b169be9a Test suite: Fix typos in NFS tickle test.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c35d3e6341bc4e288393efa429b68bf6568b9b11)
2010-08-26 15:50:35 +10:00
Martin Schwenke
9235dc727a Test suite: NFS tickle test uses gettickles if events.d/61.nfstickle missing.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 4763ccbfeaedd0fd953dbeda17ef9af41386688b)
2010-08-26 15:28:19 +10:00
Martin Schwenke
a104d1d823 NFS tickles: use addtickle/deltickle instead of shared tickle directory.
This adds a new function update_tickles() that tracks tickles for a
given port using the new ctdb addtickle/deltickle commands.  This
function is used in events.d/60.nfs to handle NFS tickles.

events.d/61.nfstickle is removed.  The
/proc/sys/net/ipv4/tcp_tw_recycle setup is also moved to
events.d/60.nfs.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit dca4c4ebf3c35f8db3ae208efb7a83abbf726ed6)
2010-08-26 14:59:59 +10:00
Martin Schwenke
0d2c554d5f Test suite: in the test eventscript, run "ctdb" not "$CTDB".
It is too hard to do anything else...

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 08b636b500855e38e708e6963d8e63ded97c25ec)
2010-08-26 14:04:03 +10:00
Martin Schwenke
6d15082045 Merge branch 'master' of git://git.samba.org/sahlberg/ctdb
(This used to be ctdb commit 090d9c8443cfa13d45f8c5d2845aea5aa9f7251d)
2010-08-26 11:06:57 +10:00
Ronnie Sahlberg
3edec07807 Add a configuration database, implemented as a persistent database.
This database can be used, as an option, to store
the public address assignment instead of editing the /etc/ctdb/public-addresses file manually.

This configuration is stored in one record per key, with a key-name of
public-addresses:node#<pnn>
where <pnn> is the node number.

The content of this record is the same syntax as the /etc/ctdb/public-addresses file.

When ctdbd starts, if this key exists and contains data, it is extracted from the database and compared with the normal file /etc/ctdb/public-addresses.

If the content differs, the config database "wins" and is used to overwrite/update the /etc/ctdb/public-addresses file, after which ctdbd is restarted.

The main benefit with this option is that it can be used to update the public address configuration for nodes that are offline/unreachable by updating their configuration in the persistent database.
Once the offline node is available again, it will resync its databases with the rest of the cluster, find out that the config has changed, apply the changes and restart ctdbd automatically.

The command to store the public address configuration for a node into the persistent database is:

ctdb pstore config.tdb public-addresses:node#<pnn> <filename>

where <pnn> is the node# we wish to update the config for, and <filename> is a file containing the new content for that node's public address configuration.
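
The startup comparison described above can be sketched as follows. This is a minimal illustration only, not ctdbd's code: the persistent database is modelled as a plain dict, and the names `config_key` and `reconcile` are hypothetical.

```python
def config_key(pnn):
    """Build the record key described in the commit message for node <pnn>."""
    return f"public-addresses:node#{pnn}"


def reconcile(db, pnn, file_content):
    """Return (content_to_use, restart_needed).

    `db` is a dict standing in for the persistent config database
    (an assumption made for this sketch). If the key is missing or
    empty, the file on disk is left alone. If the record differs from
    the file, the database "wins" and a restart is requested.
    """
    record = db.get(config_key(pnn))
    if not record:
        return file_content, False
    if record == file_content:
        return file_content, False
    return record, True
```

With this model, a node that was offline while its record was updated would see a differing record on its next start, take the database content, and restart.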

(This used to be ctdb commit 292d7435a360efd7f15a7a99f658a605e07c0a81)
2010-08-25 11:49:56 +10:00
Ronnie Sahlberg
55c619f072 the tfetch command can be used without the daemon running, so flag it as such.
fix a couple of incorrect settings for "auto-all" for a few of the commands as well.

(This used to be ctdb commit 9999771105d7105efaa232fe2842e21e66f78706)
2010-08-25 11:11:12 +10:00
Ronnie Sahlberg
018063b8eb add a new command "ctdb tfetch" that can read a record straight out of the
tdb file.

the command automatically strips the initial ctdb header off the record so it can only be used on ctdb managed tdb files, not on normal tdb files.

(This used to be ctdb commit c3a816e5174abefb5155f65d8faad7b1e831e481)
2010-08-25 10:56:02 +10:00
Ronnie Sahlberg
f75b984b71 When "ctdb pfetch" creates a new file, make sure we set some initial sane mode bits
(This used to be ctdb commit 87160c91bfd87e8b9c510dacbf00e5aa481d2305)
2010-08-25 10:35:12 +10:00
Ronnie Sahlberg
ac335e3e5d run the "init" event before we freeze the databases
so that we can read from databases during this event

(This used to be ctdb commit 6c93bf5a1219617bfb39b093aee3200c74c2c61a)
2010-08-25 08:35:24 +10:00
Ronnie Sahlberg
4c5a4015f3 change "ctdb pfetch" to take an optional third argument
as a file to store the record in.

(This used to be ctdb commit 6d7e62f5401f0647a519fe0b74ec628418e33231)
2010-08-25 08:07:47 +10:00
Ronnie Sahlberg
a8db1adcd6 add a command to write a record to a persistent database
"ctdb pstore <db> <key> <file containing possibly binary data>"

(This used to be ctdb commit 14184ab7c80a3ef16c54b4ab168fd635b7add445)
2010-08-24 14:00:18 +10:00
Ronnie Sahlberg
4da818504a get rid of two compiler warnings
(This used to be ctdb commit 0865f0e6ef671396aa862f6a79a48a4891d72122)
2010-08-24 14:00:10 +10:00
Ronnie Sahlberg
401732a56b Add a command "ctdb pfetch <db> <record>" to read a record from
a persistent database.

(This used to be ctdb commit 3bef831b96ce8b40457ed4de527f0d62fa6a5b00)
2010-08-24 14:00:02 +10:00
Martin Schwenke
f2e2abbaad Merge branch 'master' of git://git.samba.org/sahlberg/ctdb
(This used to be ctdb commit 718ddc2264c28185fcddbc9cb0c7137d198a43a7)
2010-08-24 11:53:29 +10:00
Ronnie Sahlberg
ccdb91a169 move the directives to build the devel file to the end of the specfile
so that the dependencies are right
or else the dependencies all end up in the devel package and not the main
ctdb package

(This used to be ctdb commit 6e4347eb8e62c28987820f6e58626271c900b011)
2010-08-23 16:00:19 +10:00
Ronnie Sahlberg
e040a966af Don't set next_interval to 0.
This can cause ctdbd to spin at 100% in the eventsystem,
creating a timed event that will immediately trigger again
and again.

On uniprocessors this causes the eventscript we are actually waiting for to
basically become cpu starved and never complete.
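
Why a zero interval spins can be seen in a small simulation. This is a sketch in simulated time, not the ctdbd event system; the function name, the one-second window and the firing cap are all hypothetical.

```python
def count_timer_fires(next_interval, window=1.0, max_fires=10_000):
    """Count how often a self-rearming timer fires within `window`
    seconds of simulated time. `max_fires` caps the loop so that a
    zero interval terminates instead of spinning forever."""
    now, fires = 0.0, 0
    while now <= window and fires < max_fires:
        fires += 1             # the timed event triggers
        now += next_interval   # and re-arms itself next_interval later
    return fires
```

With `next_interval=0.0` simulated time never advances, so the count only stops at the cap; any positive interval bounds the number of firings in the window.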

(This used to be ctdb commit 92c8408fba957a8ded13f7e285da290502735234)
2010-08-20 15:00:45 +10:00
Ronnie Sahlberg
1ef66379d7 ctdb ip is very busy.
revert the default case back to only showing the ip and node
and only display the extra info if -v verbose output is requested

(This used to be ctdb commit 6488651aa7e105c57324f4a300760a010d098fbb)
2010-08-20 11:38:34 +10:00
Ronnie Sahlberg
08a5b0c7c5 add a new commandline flag -v to enable verbose output
(This used to be ctdb commit 96dd9f40f9464c3d9de98f1323568724a1e31dc9)
2010-08-20 11:28:24 +10:00
Ronnie Sahlberg
388d18cc93 make it possible for "ctdb gettickle" to list only the tickles for a certain
port.

Default is to continue to show all tickles, but if a second argument
is given, only tickles for that port will be shown.
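
The optional filter can be sketched as below. The tuple layout and the choice to match on the destination port are assumptions made for illustration; the commit message does not say which port is compared.

```python
def filter_tickles(tickles, port=None):
    """Return all tickles, or only those for `port` when one is given.

    Each tickle is modelled as (src_ip, src_port, dst_ip, dst_port).
    Matching on dst_port is an assumption of this sketch.
    """
    if port is None:
        return list(tickles)   # default: show everything
    return [t for t in tickles if t[3] == port]
```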

(This used to be ctdb commit 5b985eb2cbbb92bf6ccfcacd633d793bcd4e3ec1)
2010-08-20 11:25:12 +10:00
Ronnie Sahlberg
7229922d97 Don't use the deprecated talloc_append_string()
Use talloc_strdup_append() instead

(This used to be ctdb commit e41581347af5ef26d429d38ed48fa46244f0dbfc)
2010-08-20 11:03:17 +10:00
Ronnie Sahlberg
32a2297b20 We need the deprecated talloc_append_string() for now
so set the TALLOC_DEPRECATED symbol to allow use of this call
from ctdb_client.c

(This used to be ctdb commit 3afa5d945a56952a7f211af068d671945de960e5)
2010-08-19 14:48:19 +10:00
Ronnie Sahlberg
2e8aac6689 Merge commit 'rusty/ports-from-1.0.112' into foo
(This used to be ctdb commit 13e58d92f5f1723e850a82ae030d0ca57e89b1ee)
2010-08-19 13:17:56 +10:00
Ronnie Sahlberg
4c05f1900c Merge commit 'rusty/vacuum-fix-master'
(This used to be ctdb commit dc301b324d2c14a2425a965c076113c4fe97903e)
2010-08-19 13:16:35 +10:00
Ronnie Sahlberg
729f1ddea0 On RHEL, "service nfs stop;service nfs start" and "service nfs restart"
sometimes (very rarely) fail to restart the service.

    Add a function to restart NFSd on SLES and RHEL-like systems.

    If we detect the system is unhealthy due to kNFSd not running,
    try to restart the service again with "service nfs restart" and
    hope for the best.

CQ1019372

(This used to be ctdb commit 25c4ce7e919f13226219f036bcffd2be76b2f06c)
2010-08-19 07:18:22 +10:00
Ronnie Sahlberg
31126b2ef0 Add machine-readable output for the "ctdb gettickles <ip>" command
(This used to be ctdb commit c3eb53509331045074579468d94ed7e31101bba4)
2010-08-18 14:37:16 +10:00
Ronnie Sahlberg
5aa5f3e7bf Remove the structure ctdb_control_tcp_vnn since this is identical to the structure ctdb_tcp_connection.
Add a new "ctdb deltickle" command to delete tickles from the database.
This can ONLY be used for tickles created by "ctdb addtickle".

Push any "addtickle/deltickle" updates to other nodes every TickleUpdateInterval seconds.

(This used to be ctdb commit acded034e2f0dcae4c2c9e54e16a001caf23caec)
2010-08-18 12:36:03 +10:00
Rusty Russell
9fbb191b78 logging: give a unique logging name to each forked child.
This means we can distinguish which child is logging, esp. via syslog where we have no pid.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 68b3761a0874429b90731741f0531f76dcfbb081)
2010-08-18 11:46:32 +09:30
Rusty Russell
1a009aff73 takeover: prevent crash by avoiding free in traverse on RST timeout
After 5 attempts to send a RST to a client without any response, we free
"con"; this is done during a traverse.  This frees the node we are walking
through (the node is made a child of "con" down in rb_tree.c's
trbt_create_node()).  Valgrind would catch this, as Martin confirmed.

So, we create a temporary parent and reparent onto that; then we free
that parent after the traverse, thus deleting the unwanted nodes.
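
The same "collect during the walk, delete afterwards" pattern can be shown by analogy in Python. This is not the talloc/rb_tree code: a dict stands in for the tree, a list stands in for the temporary parent, and the names and the attempt counting are hypothetical.

```python
def expire_connections(conns, max_attempts=5):
    """Walk `conns` (key -> {"attempts": int}), bump each attempt
    counter, and remove entries that reached `max_attempts`.

    Entries are only collected during the walk and deleted after it,
    so the container is never modified while it is being traversed.
    """
    doomed = []                        # plays the role of the temporary parent
    for key, con in conns.items():
        con["attempts"] += 1
        if con["attempts"] >= max_attempts:
            doomed.append(key)         # "reparent" instead of freeing now
    for key in doomed:                 # traverse finished: safe to delete
        del conns[key]
    return doomed
```

Deleting inside the first loop would raise a RuntimeError in Python; in C the equivalent mistake is a use-after-free, which is the crash the commit avoids.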

CQ:S1019041
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 08f7f85477610a4916c1ec866aa467b28f1bbec3)
2010-08-18 11:40:17 +09:30
Martin Schwenke
6ce1501aa1 Move NAT gateway firewall rules to recovered|updatenatgw events.
The existing code wasn't working as designed in the start event.  It
should work here.

BZ: 62613
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit aeb70c7e7822854eb87873a5c7783e27e6e72318)
2010-08-18 11:40:07 +09:30
Rusty Russell
5f2d43157d vacuum: disabling vacuuming during a freeze
We shouldn't even think about vacuuming when we've frozen the database
(which is earlier than when we set CTDB_RECOVERY_ACTIVE)

CQ:S1018154 & S1018349
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit d8df6835a931082af232c4b94f1dede6f16169f9)
2010-08-18 11:01:52 +09:30
Rusty Russell
0b07f91d36 vacuum: fix crash on vacuum abort
Martin Schwenke discovered that 517f05e42f17766b1e8db8f1f4789cbad968e304
("freeze: abort vacuuming when we're going to freeze.") used ctdb_db for
a logging message which is in fact uninitialized, causing a crash (even
if it wasn't actually logged).

Initialize it properly.  Also fix incorrect format in another logging
message introduced in that same change.

CQ:S1019093
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 8e518950ba281502318d6300f7a5ec6cdf6b5674)
2010-08-18 11:00:11 +09:30
Martin Schwenke
4e9fe3545c Test suite: loosen the getmonmode test.
Monitoring could be off at the beginning of the test.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 6a33a7715067175869ea2f3f15b64c3371079a6b)
2010-08-18 11:25:44 +10:00
Rusty Russell
af55c910a4 freeze: abort vacuuming when we're going to freeze.
There are some reports of freeze timeouts, and it looks like vacuuming might
be the culprit.  So we add code to tell them to abort when a freeze is
going on.

(This is based on the 1.0.112 branch version 517f05e42f, but far
 simpler since tdb is now robust against processes being killed during
 transaction commit)

CQ:S1018154 & S1018349
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit f5d7dc679501e607c2c83a248a89d3cada9df146)
2010-08-18 10:54:28 +09:30
Ronnie Sahlberg
44ff992806 Add a new "ctdb addtickle" command to manually add tickles to ctdbd
This can be used to set ctdbd up to generate a tickle for non-samba
services.
(samba contains code to set tickles up automatically)

(This used to be ctdb commit 7ef2cddad5326fdcc26138906948342039829495)
2010-08-18 11:09:32 +10:00
Ronnie Sahlberg
0e5be63bca update the example for the new signature of
ctdb_set_message_handler_send()

(This used to be ctdb commit 6aabe52d5ba629291aa630bc96a2b74dcecc5209)
2010-08-18 10:18:35 +10:00
Ronnie Sahlberg
e8ffb0d8a4 We use eventloop nesting in a couple of places, notably the sync
parts of the recovery daemon.

Initialize all event contexts to allow nesting

(This used to be ctdb commit 5bf6bd5e7f33aabbeb7b9707716ef99cf471e590)
2010-08-18 10:11:59 +10:00
Ronnie Sahlberg
ddf3c621c1 Merge commit 'rusty/libctdb-new' into foo
(This used to be ctdb commit 1566d2d23ab698896b3b6a76974a5c7452db4a62)
2010-08-18 09:53:52 +10:00
Rusty Russell
f93440c4b7 event: Update events to latest Samba version 0.9.8
In Samba this is now called "tevent", and while we use the backwards
compatibility wrappers they don't offer EVENT_FD_AUTOCLOSE: that is now
a separate tevent_fd_set_auto_close() function.

This is based on Samba version 7f29f817fa.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 85e5e760cc91eb3157d3a88996ce474491646726)
2010-08-18 09:16:31 +09:30
Rusty Russell
532e4a7077 talloc: update to 2.0.3 version from SAMBA
This is based on SAMBA as at revision 2de63aa280.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit cecd93be0a0aab868430dd43f8276bfb4e35f02e)
2010-08-18 09:11:58 +09:30
Martin Schwenke
a3e9fe2058 Test suite: Add more timestamping of debugging information.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 4cdf3b9adc7edfd80a2901ef8457ae67aab0829a)
2010-08-17 09:55:48 +10:00
Martin Schwenke
e28d2b8f22 Test suite: print date/time at test completion.
This should help with log cross-checking.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c0a916c40c623c0aa8245526283a064dbeea4b57)
2010-08-17 09:52:15 +10:00
Volker Lendecke
a79168f587 Correctly set docdir
(This used to be ctdb commit a69916d0687309766b0014dc9cee6a966aaa89da)
2010-08-16 11:28:05 +10:00
Rusty Russell
c27094742b tdb: workaround starvation problem in locking entire database.
(Imported from SAMBA 11ab43084b)

We saw tdb_lockall() take 71 seconds under heavy load; this is because Linux
(at least) doesn't prevent new small locks being obtained while we're waiting
for a big lock.

The workaround is to do divide and conquer using non-blocking chainlocks: if
we get down to a single chain we block.  Using a simple test program where
children did "hold lock for 100ms, sleep for 1 second" the time to do
tdb_lockall() dropped significantly.  There are ln(hashsize) locks taken in
the contended case, but that's slow anyway.
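
The divide-and-conquer idea can be modelled as below. This is a rough sketch, not tdb's implementation: tdb uses fcntl byte-range locks on the file, whereas here each hash chain is a `threading.Lock` and a "range lock" is approximated by trying every chain in the range without blocking. The name `lock_range` is hypothetical.

```python
import threading


def lock_range(chains, lo, hi):
    """Acquire every lock in chains[lo:hi].

    A range is first tried without blocking. On contention the partial
    acquisitions are dropped and the range is split in half. Only when
    the range is a single chain do we block.
    """
    if hi - lo == 1:
        chains[lo].acquire()               # single chain: wait for it
        return
    taken = []
    for i in range(lo, hi):
        if chains[i].acquire(blocking=False):
            taken.append(i)
        else:
            break                          # contended: stop trying this range
    if len(taken) == hi - lo:
        return                             # whole range obtained in one go
    for i in taken:
        chains[i].release()                # back off before splitting
    mid = (lo + hi) // 2
    lock_range(chains, lo, mid)
    lock_range(chains, mid, hi)
```

Calling `lock_range(chains, 0, len(chains))` while another thread briefly holds one chain blocks only on that chain, and the uncontended halves are taken without waiting.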

More analysis is given in my blog at http://rusty.ozlabs.org/?p=120

This may also help transactions, though in that case it's the initial
read lock which uses this gradual locking routine; the update-to-write-lock
code is separate and still tries to update in one go.

Even though ABI doesn't change, minor version bumped so behavior change
can be easily detected.

CQ:S1018154
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 9ec0009443a0ac4187ce5212a5143689daa58a02)
2010-08-16 10:22:21 +09:30
Rusty Russell
546eff9c93 tdb: Fix tdb_check() to work with read-only tdb databases.
(Import from SAMBA bc1c82ea13)
The function tdb_lockall() uses F_WRLCK internally, which doesn't work on
a fd opened with O_RDONLY. Use tdb_lockall_read() instead.
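
The underlying restriction can be observed from Python's `fcntl` module. This sketch assumes a POSIX system (Linux behaviour is what is described here) and uses a throwaway temporary file rather than a real tdb; the helper names are hypothetical.

```python
import fcntl
import os
import tempfile


def try_lock(fd, exclusive):
    """Return True if a non-blocking whole-file lock of the requested
    kind succeeds on `fd`. LOCK_EX maps to a write lock (F_WRLCK),
    LOCK_SH to a read lock (F_RDLCK)."""
    kind = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    try:
        fcntl.lockf(fd, kind | fcntl.LOCK_NB)
    except OSError:
        return False
    fcntl.lockf(fd, fcntl.LOCK_UN)
    return True


def demo():
    """Try a write lock and then a read lock on a read-only descriptor."""
    with tempfile.NamedTemporaryFile() as tmp:
        fd = os.open(tmp.name, os.O_RDONLY)
        try:
            return try_lock(fd, exclusive=True), try_lock(fd, exclusive=False)
        finally:
            os.close(fd)
```

On Linux the write lock is expected to be refused on the read-only descriptor while the read lock succeeds, which is why a read lock is the appropriate choice for checking a read-only database.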

(This used to be ctdb commit a5db1122ec48d7e7384066848457c850c1a6cf3c)
2010-08-16 10:20:59 +09:30
Rusty Russell
fa2a32d5ef tdb: remove unused variable in tdb_new_database().
(Imported from SAMBA 2eab1d7fdc)

(This used to be ctdb commit 52a87e608d0406aee9df99f7ac3ce16e834b520b)
2010-08-16 10:20:53 +09:30