mirror of https://github.com/samba-team/samba.git synced 2025-02-01 05:47:28 +03:00

428 Commits

Author SHA1 Message Date
Stefan Metzmacher
0e436b46c6 client: add ctdb_ctrl_getdbhealth()
metze

(This used to be ctdb commit 5abe44d0113839d3a45c9a31d30856aa70c2ea1f)
2009-12-16 08:08:32 +01:00
Stefan Metzmacher
003985acfd ctdb: pass TDB_DISALLOW_NESTING to all tdb_open/tdb_wrap_open calls
metze

Signed-off-by: Stefan Metzmacher <metze@samba.org>

(This used to be ctdb commit 1635e931b909c66eb3b1f5357e3a549b1a0da70d)
2009-12-16 08:03:55 +01:00
Michael Adam
c2c9a04cf2 client: lower level of commit retry message WARNING->DEBUG
This can happen frequently when recoveries intercept transactions.

Michael

(This used to be ctdb commit c46adb210e47530488503e20d682d4d182c0fb79)
2009-12-09 21:56:59 +01:00
Michael Adam
97d780bc20 client: lower debug level of transaction-active-retry message to DEBUG
This reduces some noise.

Michael

(This used to be ctdb commit 54d227811753f4a87f1a2c9dc0b1389f5ca2a12f)
2009-12-09 21:56:59 +01:00
Rusty Russell
a46c3b4f2a ctdb: scriptstatus can now query non-monitor events
We also no longer return an error before scripts have been run; a special
zero-length data means we have never run the scripts.

"ctdb scriptstatus all" returns all event script results.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 9b90d671581e390e2892d3a68f3ca98d58bef4df)
2009-12-08 01:50:55 +10:30
Rusty Russell
9753b7e793 eventscript: rename ctdb_monitoring_wire to ctdb_scripts_wire
We're going to allow fetching status of all script runs, so this
name is no longer appropriate.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit f5cb41ecf3fa986b8af243e8546eb3b985cd902a)
2009-12-08 00:51:24 +10:30
Rusty Russell
c309d22f9a eventscript: remove unused ctdb_ctrl_event_script*
The child no longer uses ctdb_ctrl_event_script_init or
ctdb_ctrl_event_script_finished, and the others are redundant: it
doesn't need to tell us it's starting a script when it only runs one.

We move start and stop calls to the parent, and eliminate the RPC
infrastructure altogether.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 391926a87a7af73840f10bb314c0a2f951a0854c)
2009-12-08 00:27:40 +10:30
Rusty Russell
b9b75bd065 eventscript: use -ENOEXEC for disabled status value
This unifies code paths and simplifies things: we just hand -ENOEXEC to
ctdb_ctrl_event_script_stop().

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit eadf5e44ef97d7703a7d3bce0e7ea0f21cb11f14)
2009-12-07 23:11:47 +10:30
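The idea behind commit b9b75bd065, a single status integer where -ENOEXEC marks a disabled script, can be illustrated with a small sketch. The function names and the "DISABLED" label are hypothetical and not taken from the ctdb source; the "OK" and "ERROR" labels follow the scriptstatus output shown elsewhere in this log.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: map a script status value to a display label.
 * Assumption: 0 means success, -ENOEXEC means the script is disabled,
 * and any other value is treated as an error. */
static const char *script_status_label(int status)
{
    if (status == 0) {
        return "OK";
    }
    if (status == -ENOEXEC) {
        return "DISABLED";
    }
    return "ERROR";
}

/* Hypothetical helper: print one line in a scriptstatus-like layout. */
static void print_script_status(FILE *out, const char *name, int status)
{
    assert(out != NULL && name != NULL);
    fprintf(out, "%-20s Status:%s\n", name, script_status_label(status));
}
```

Because the disabled state is just another status value, the caller needs only one code path to record and report every outcome.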
Michael Adam
98c108fa33 client: improve two error messages in ctdb_transaction_commit().
Michael

(This used to be ctdb commit d971b2ca84c0451dc7e5acbf4a5ade06270a2044)
2009-12-04 15:06:54 +01:00
Michael Adam
cc7438d87d client: increase the number of commit retries 10-->100
To cope with timeouts when recoveries and transactions collide.
Maybe 100 is too high.

Michael

(This used to be ctdb commit c23d804165e84bdf95ba960c953c736d361011d7)
2009-12-04 15:03:16 +01:00
Michael Adam
b3fd495522 client: untangle checks and produce more detailed error messages
in ctdb_transaction_fetch_start

Michael

(This used to be ctdb commit 428914377851a98b3fc893798783fbfebffc1c0d)
2009-12-04 15:03:16 +01:00
Michael Adam
7afefed6ae client: increase the rsn of the __transaction_lock__ when storing
So that it is correctly handled by recoveries.
Also explicitly set the dmaster field to the current node's pnn.

Michael

(This used to be ctdb commit 03a5bb727b9db1ba952632f08ceb5355f0df842d)
2009-12-04 15:02:41 +01:00
Michael Adam
0635f8b98f make ctdb_ctrl_transaction_active public.
Michael

(This used to be ctdb commit e5496a83ef4a01604195b27c4b97f50d4979510e)
2009-12-04 11:30:22 +01:00
Michael Adam
27dc0adfb5 client: in catdb, print the keyname first, and separate records by a blank line
Michael

(This used to be ctdb commit b9882710e12f28c96a0af298e419160f00578241)
2009-12-04 11:30:21 +01:00
Michael Adam
c532347a45 client: randomize the transaction_start retry loop:
instead of sleeping 1 second, sleep between 1 and 100 milliseconds

Michael

(This used to be ctdb commit a5d90d8ed8b44355c4ffb9c32ded772025fcc174)
2009-10-30 22:02:21 +11:00
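The randomized backoff described in commit c532347a45 can be sketched as follows. This is a minimal illustration, not the ctdb implementation: both helper names are hypothetical, and rand() is used only for brevity (its slight modulo bias does not matter for spreading out retries).

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: pick a delay in [min_ms, max_ms] milliseconds. */
static unsigned int random_retry_delay_ms(unsigned int min_ms,
                                          unsigned int max_ms)
{
    assert(min_ms <= max_ms);
    unsigned int span = max_ms - min_ms + 1;
    return min_ms + (unsigned int)rand() % span;
}

/* Hypothetical helper: sleep for the given number of milliseconds.
 * An interrupted sleep is not resumed in this sketch. */
static void sleep_ms(unsigned int ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (long)(ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}
```

A retry loop would call sleep_ms(random_retry_delay_ms(1, 100)) between attempts, so that competing clients are unlikely to wake up and collide at the same instant.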
Michael Adam
118185670d client: fix a race in the local race condition fix in transaction_start
The gap that remained is between checking whether a transaction commit
is in progress and taking the lock. Now we first take the lock and then
check whether a transaction commit is in progress. If so, we release the
lock, wait for one second and retry.

Michael

(This used to be ctdb commit b95524c08bf12914120cb6c818ecc1c99738fe37)
2009-10-30 22:01:16 +11:00
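The ordering fix in commit 118185670d (take the lock first, then check for a commit in progress, and back off if one is found) can be sketched with a toy model. Everything here is an assumption made for illustration: struct db_state and all function names are hypothetical, and the wait is modelled as a counter rather than a real sleep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state a client observes on one database. */
struct db_state {
    bool lock_held;        /* models the transaction lock record */
    int commit_ticks_left; /* > 0 while a commit is still in progress */
};

static void take_lock(struct db_state *db)    { db->lock_held = true; }
static void release_lock(struct db_state *db) { db->lock_held = false; }

static bool commit_in_progress(const struct db_state *db)
{
    return db->commit_ticks_left > 0;
}

/* Models waiting: each call lets the pending commit advance one step. */
static void wait_before_retry(struct db_state *db)
{
    if (db->commit_ticks_left > 0) {
        db->commit_ticks_left--;
    }
}

/* Returns the number of retries that were needed, or -1 on giving up.
 * On success the lock is still held by the caller. */
static int transaction_start_sketch(struct db_state *db, int max_retries)
{
    assert(db != NULL);
    for (int attempt = 0; attempt <= max_retries; attempt++) {
        take_lock(db);               /* lock first ...                 */
        if (!commit_in_progress(db)) /* ... then check, under the lock */
            return attempt;
        release_lock(db);            /* let the commit finish          */
        wait_before_retry(db);
    }
    return -1;
}
```

Checking only after the lock is held removes the window in which a commit could begin between the check and the lock acquisition.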
Michael Adam
c2855a11a8 client: add a debug message when a transaction_commit needs to be retried
Michael

(This used to be ctdb commit 9e4902c7d3ad1329c296f4196fcb1396f2a7a6a0)
2009-10-30 22:00:42 +11:00
Michael Adam
45c17515c3 client: log db_id as 8-digit hex in ctdb_transaction_fetch_start()
Michael

(This used to be ctdb commit d7b9babda2f7c7f7b95ee19ec75c37200816c6ef)
2009-10-30 09:28:49 +11:00
Michael Adam
361aec199e client: improve "control timed out" debug message
* add __location__
* wrap overly long line
* print unsigned ints as unsigned (reqid, opcode, destnode)

Michael

(This used to be ctdb commit 6b47ea111867c845974aa2687a658ebca2854816)
2009-10-30 09:22:52 +11:00
Ronnie Sahlberg
023d09cd38 Revert "update the "uptime" command to indicate the "time since last" is the time since the last recovery OR failover."
This reverts commit 3b0d44497800a16400d05a30bdaf6e6c285d4b36.

(This used to be ctdb commit cb36bbb5418290e8e5b770d2d836285b15da2a6f)
2009-10-29 10:49:00 +11:00
Ronnie Sahlberg
279b7ca564 update the "uptime" command to indicate the "time since last" is the time since the last recovery OR failover.
(This used to be ctdb commit 3b0d44497800a16400d05a30bdaf6e6c285d4b36)
2009-10-29 10:37:10 +11:00
Michael Adam
2419eab0d9 ctdb_client: reformat a comment slightly to enhance clearness.
Michael

(This used to be ctdb commit 9560f8b7fe0f7ee0386a87c2653333071050fe4b)
2009-10-29 10:15:54 +11:00
Michael Adam
5d579cf665 client: fix race condition with concurrent transactions on the same node.
In ctdb_transaction_commit(), when the trans2_commit control fails, there
is a race condition in the 1 second sleep between the local transaction_cancel
and the call to ctdb_replay_transaction(): The database is not locked, and
neither is the transaction_lock record. So another client can start and possibly
complete a new transaction in this gap, but only on the same node: The locking
of the transaction_lock record on a different node which involves migration of
the record to the other node has been disabled by introduction of the
transaction_active flag on the db which closes precisely this gap from the start
of the commit until the call to TRANS2_FINISH or TRANS2_ERROR.
But this mechanism does not cover the case where a process on the same node
tries to start a transaction: There is no obstacle to locking the transaction_lock
record because the record does not need to be migrated.

This commit closes this race condition in ctdb_transaction_fetch_start()
by using the new ctdb_ctrl_transaction_active() call to ask the local
ctdb daemon whether it has a transaction running on the database.
If so, the check is repeated until the running transaction is done.

This does introduce an additional call to the local ctdbd when starting
transactions, but it does close the (hopefully) last race condition.

Michael

(This used to be ctdb commit 02ee9dfd3c6b09f5c5172a7e38738c20b7f0aecd)
2009-10-29 10:15:21 +11:00
Michael Adam
953ccee5c5 client: add ctdb_ctrl_transaction_active() which calls out to CTDB_TRANS2_ACTIVE
Michael

(This used to be ctdb commit 813cfd7c625ac8af4ef169cc92fb6d69f66004c9)
2009-10-29 10:15:00 +11:00
Ronnie Sahlberg
4d40b86805 for debugging
add a global variable holding the pid of the main daemon.
change the tracking of time() in the event loop to only check/warn when called from the main daemon

(This used to be ctdb commit a10fc51f4c30e85ada6d4b7347b0f9a8ebc76637)
2009-10-27 13:18:52 +11:00
Stefan Metzmacher
1c6829f3c2 ctdb_client: fix DEBUG statement in ctdb_ctrl_modflags()
metze

(This used to be ctdb commit a244b75ee49556b0ff51e254cc812594ee3b23a7)
2009-10-26 14:22:07 +11:00
Ronnie Sahlberg
73c0adb029 initial attempt at freezing databases in priority order
(This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2)
2009-10-12 12:08:39 +11:00
Ronnie Sahlberg
d4c98516a2 update the freeze/thaw commands to be able to send the requested database priority to freeze/thaw to the daemon.
this is encoded in the srvid field of the request header

(This used to be ctdb commit 0cb3d33caa42ed783e03bc825b181dde4cf63616)
2009-10-12 09:22:17 +11:00
Ronnie Sahlberg
3219f81710 add a control to read the db priority from a database
(This used to be ctdb commit ca6d045e419f308f57e74d4c978907afb05ddb85)
2009-10-10 15:04:18 +11:00
Ronnie Sahlberg
6cf7d8e131 add a control to set a database priority. Let newly created databases default to priority 1.
database priorities will be used to control the order in which databases are locked during recovery.

(This used to be ctdb commit 67741c0ee01916d94cace8e9462ef02507e06078)
2009-10-10 14:26:09 +11:00
Ronnie Sahlberg
71e4259150 add a new function to collect a list of all active nodes EXCEPT a certain node
(This used to be ctdb commit be52954d921e7d443304cf49fbd488c619a9c4ec)
2009-10-06 10:52:31 +11:00
Michael Adam
3cb4bcd211 ctdb_client: fix race in starting concurrent transactions on a single node
There are two races in concurrent transactions on a single node.
One in starting a transaction, and one with committing (replaying).

This commit closes the first race by storing the pid in the
transaction-lock record and comparing the own pid against it
as a measure to prevent starting a second transaction when
a second node has come in between and changed the pid in the lock
record.

Michael

(This used to be ctdb commit 84e5a55a900b01903b80e23045edfc726d8d77a1)
2009-09-21 11:16:18 +02:00
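The pid check described in commit 3cb4bcd211 can be sketched as below. This is a simplified model under stated assumptions: struct lock_record and both functions are hypothetical, and the record is an in-memory buffer rather than a real database record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical in-memory stand-in for the transaction lock record. */
struct lock_record {
    unsigned char data[sizeof(pid_t)];
    size_t len; /* 0 means no pid has been stored yet */
};

/* Store the owner's pid in the record. */
static void lock_record_store_pid(struct lock_record *rec, pid_t pid)
{
    assert(rec != NULL);
    memcpy(rec->data, &pid, sizeof(pid));
    rec->len = sizeof(pid);
}

/* Return true only if the record currently holds exactly this pid. */
static bool lock_record_owned_by(const struct lock_record *rec, pid_t pid)
{
    pid_t stored;

    assert(rec != NULL);
    if (rec->len != sizeof(stored)) {
        return false;
    }
    memcpy(&stored, rec->data, sizeof(stored));
    return stored == pid;
}
```

A client would store its own pid when taking the lock record and later compare against it; if the comparison fails, another party has rewritten the record in the meantime and the transaction start must be retried.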
Ronnie Sahlberg
cda5f02c7c new prototype banning code
(This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a)
2009-09-04 02:20:39 +10:00
Ronnie Sahlberg
1cc79905ad add new controls to make it possible to enable/disable individual eventscripts
update scriptstatus output so it lists disabled scripts

(This used to be ctdb commit 7e799b7523c9699bd65a8a8207f7e03d668b0b81)
2009-08-13 13:04:08 +10:00
Michael Adam
572d397626 client: fix a debug message (misplaced newline).
Michael

(This used to be ctdb commit c513a31d755003d7af91529790b06ce0d226c90f)
2009-08-04 09:46:39 +02:00
Michael Adam
cfbdba0be6 client:ctdb_control_send: remove duplicate setting of the reqid header.
Michael

(This used to be ctdb commit 875778fbbfd6b0f09fd2db76f7348ad6271350a3)
2009-08-04 09:46:39 +02:00
Michael Adam
a6bd36933a client: refuse to do record_store() on a persistent tdb.
Only allow stores wrapped in transactions on persistent dbs.

Michael

(This used to be ctdb commit 9dea71cf72ef79a9aadf8ee7cf1a1899527459ff)
2009-07-29 11:17:07 +10:00
Michael Adam
188ab0f96c client: set dmaster in ctdb_transaction_store() also when updating an existing record
Michael

(This used to be ctdb commit e9194a130327d6b05a8ab90bd976475b0e93b06d)
2009-07-29 10:28:35 +10:00
Ronnie Sahlberg
62c4a841d2 When processing the stop node control reply in the client code we should
also check the returned status code in case the _stop() command failed
due to the eventscripts failing.

If this happens, make "ctdb stop" log an error to the console and try
the operation again.

(This used to be ctdb commit 20e82e0c48e07d1012549f5277f1f5a3f4bd10d1)
2009-07-29 09:58:40 +10:00
Ronnie Sahlberg
37d68c58b8 add two commands : setlmasterrole and setrecmasterrole to enable/disable these capabilities at runtime
(This used to be ctdb commit 51aaed0e9e42e901451292e8dd545297ab725a62)
2009-07-28 13:45:13 +10:00
Ronnie Sahlberg
72e2380e92 add a command "setnatgwstate {on|off}" that can be used to indicate if this node is using natgw functionality or not.
(This used to be ctdb commit 89a9bb29a60a6fb1fba55987e6cf0a4baa695e50)
2009-07-28 09:58:11 +10:00
Ronnie Sahlberg
88f3c40d9c add two new controls, STOP_NODE and CONTINUE_NODE
that are used to stop/continue a node instead of using modflags messages

(This used to be ctdb commit 54b4a02053a0f98f8c424e7f658890254023d39a)
2009-07-09 12:22:46 +10:00
Ronnie Sahlberg
5b235c3999 add a control to set the reclock file
(This used to be ctdb commit 36cc2e586f03fa497ee9b06f3e6afc80219c4aaa)
2009-06-25 14:25:18 +10:00
Ronnie Sahlberg
10db6a41df return NULL and not a "" when there is no reclock file returned from the server
(This used to be ctdb commit 6755f89f81aba63bfe00ee16d44a0201cbfa90ca)
2009-06-25 12:26:14 +10:00
Ronnie Sahlberg
2b253c094c add a control to read the current reclock file from a node
(This used to be ctdb commit ed6a4cbcdcbb4e0df83bec8be67c30288bf9bd41)
2009-06-25 12:17:19 +10:00
Ronnie Sahlberg
26e1486db7 Whitespace changes and using the CTDB_NO_MEMORY() macro changes to
the previous patch.

(This used to be ctdb commit d623ea7c04daa6349b42d50862843c9f86115488)
2009-05-21 11:49:16 +10:00
Sumit Bose
2fcedf6dac add missing checks on so far ignored return values
Most of these were found during a review by Jim Meyering <meyering@redhat.com>

(This used to be ctdb commit 3aee5ee1deb4a19be3bd3a4ce3abbe09de763344)
2009-05-21 11:22:21 +10:00
Ronnie Sahlberg
0d48af4741 add additional log info to track if/why we can't switch to client mode.
(This used to be ctdb commit 722171fc94a36ffe9e0a5c64502b916fde0a13a4)
2009-05-14 18:25:00 +10:00
Ronnie Sahlberg
98a54c4675 Track how long it takes to take out the recovery lock from both the main daemon and also from the recovery daemon.
Log this in "ctdb statistics".

Also add a variable "RecLockLatencyMs" that will log an error every time it takes longer than this to access the reclock file.

(This used to be ctdb commit 042377ed803bb8f7ca9d6ea1a387427b7b8ba45a)
2009-05-14 10:33:25 +10:00
root
629d5ee1fa add a new command "ctdb scriptstatus"
this command shows which eventscripts were executed during the last monitoring cycle and the status from each eventscript.

If an eventscript timed out or returned an error we also
show the output from the eventscript.

Example :
[root@rcn1 ctdb-git]# ./bin/ctdb scriptstatus
6 scripts were executed last monitoring cycle
00.ctdb              Status:OK    Duration:0.021 Mon Mar 23 19:04:32 2009
10.interface         Status:OK    Duration:0.048 Mon Mar 23 19:04:32 2009
20.multipathd        Status:OK    Duration:0.011 Mon Mar 23 19:04:33 2009
40.vsftpd            Status:OK    Duration:0.011 Mon Mar 23 19:04:33 2009
41.httpd             Status:OK    Duration:0.011 Mon Mar 23 19:04:33 2009
50.samba             Status:ERROR    Duration:0.057 Mon Mar 23 19:04:33 2009
   OUTPUT:ERROR: Samba tcp port 445 is not responding

Add a new helper function "switch_from_server_to_client()" for use both by
the recovery daemon and by the child process we start for running the actual eventscripts.

Create several new controls: some let the eventscript child process inform the master daemon of the current status of the scripts, and others let the ctdb tool extract this information from the running daemon.

(This used to be ctdb commit c98f90ad61c9b1e679116fbed948ddca4111968d)
2009-03-23 19:07:45 +11:00