1
0
mirror of https://github.com/samba-team/samba.git synced 2024-12-30 13:18:05 +03:00
Commit Graph

752 Commits

Author SHA1 Message Date
Amitay Isaacs
cbffbb7c2f ctdb-daemon: Do not run monitor event if any other event is already running
Any currently running monitor events are cancelled if any other events
are scheduled.  However, this does not stop monitor events to be run
when other events are already running.

Keep track of the number of active events and schedule monitor event
only if there are no active events.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2014-01-21 11:30:41 +11:00
Martin Schwenke
e6304d1e1a ctdb/daemon: Untangle serialisation of 1st recovery -> startup -> monitor
At the moment ctdb_check_healthy() is overloaded to wait until the
first recovery is complete, handle the "startup" event and also
actually handle monitoring.  This is untidy and hard to follow.

Instead, have the daemon explicitly wait for 1st recovery after the
"setup" event.  When first recovery is complete, schedule a function
to handle the "startup" event.  When the "startup" event succeeds then
explicitly enable monitoring.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-01-17 17:59:41 +11:00
Amitay Isaacs
a92fd11ad1 ctdb-daemon: Remove ctdb_fork_with_logging()
This function has been replaced with ctdb_vfork_with_logging().

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Jan 16 04:05:35 CET 2014 on sn-devel-104
2014-01-16 04:05:35 +01:00
Amitay Isaacs
2879404388 ctdb-daemon: Add ctdb_vfork_with_logging()
This will be used to spawn lightweight helper processes to run
eventscripts.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2014-01-16 11:41:12 +11:00
Amitay Isaacs
7aa20ccb5c ctdb-daemon: No need to call event scripts with CTDB_CALLED_BY_USER
This was added to support external monitoring using CTDB event scripts.
However, it was never used.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2014-01-16 11:41:12 +11:00
Amitay Isaacs
bafa467021 ctdb-daemon: Deprecate RELOAD and STATUS events
These events have never been used.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2014-01-16 11:41:12 +11:00
Amitay Isaacs
d21919c8b4 ctdb-common: Refactor code to keep track of child processes
This code can then be used to track child processes created with vfork().

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
2013-11-27 18:46:16 +01:00
Amitay Isaacs
094f34e9bf ctdb-locking: Implement active lock requests limit per database
This limit was currently a global limit and not per database.  This
prevents any database freeze lock requests from getting scheduled if
the global limit was reached.

Only individual record requests should be limited and database freeze
requests should always get scheduled.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
2013-11-27 18:46:16 +01:00
Martin Schwenke
e850cddcc4 ctdb-tools/ctdb: New ptrans command
Also add test.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
2013-11-27 18:46:16 +01:00
Martin Schwenke
028fe930b6 ctdb-recoverd: Fix backward compatibility for CTDB_SRVID_TAKEOVER_RUN
When running a mixed version cluster, compatibility with older
versions was was broken during recent refactorisation.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
2013-11-27 18:46:16 +01:00
Amitay Isaacs
41d37058ca tunables: Remove obsolete tunables
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit ca5fc3431573c44d55d09d987c715fb53756fc1f)
2013-10-30 15:37:11 +11:00
Amitay Isaacs
7eb680a95f build: Move the default CTDB socket from /tmp to /var/run/ctdb
Use /var/run/ctdb/ctdbd.socket because there might be other daemons
that need sockets in the future.

The local daemons test code to create a link for the default
convenience socket has to be removed because the link can't be created
as a regular user in the new location.  This should be OK since all
calls to the ctdb tool in the test code should be wrapped in onnode.
When debugging tests, a developer will have to set CTDB_SOCKET by
hand.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-programmed-with: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit dc67a4e24af9d07aead2a1710eeaf5d6cc409201)
2013-10-25 12:06:07 +11:00
Martin Schwenke
b595712f25 ctdbd: Simplify database directory setting logic
No need to check if the options are set.  The options are always set
via static defaults.

No need to talloc_strdup() the values via wrapper functions.  The
options aren't going away.  Remove now unused ctdb_set_tdb_dir() and
similar functions.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 1fe82f3d7b610547ff4945887f15dd6c5798a49b)
2013-10-25 12:06:06 +11:00
Martin Schwenke
bd73e017b0 common: New function ctdb_mkdir_p_or_die()
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 7b971df79b0b63f83555205eacf48d49ca3a273a)
2013-10-25 12:06:06 +11:00
Martin Schwenke
c07e3830b3 common: New function mkdir_p()
Behaves like mkdir -p.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit afe2145d91725daf1399f0a24f1cddcf65f0ec31)
2013-10-25 12:06:06 +11:00
Michael Adam
49fcfd2cb3 ctdb_client.h: fix build on AIX by removing C++-style comments
Reported by John P Janosik <jpjanosi@us.ibm.com>

Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit 1f327401f2e181780937aa3f6c479376ff787f3f)
2013-10-23 00:53:56 +02:00
Martin Schwenke
e782b61732 ctdbd: Pass the public address file location in ctdb context
No need to pass it as an extra argument to ctdb_start_daemon.

Also ensure options.public_address_list gets a nice static default.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a3d63a9db89d08bb284b3b3a6db773422f21b477)
2013-10-22 15:37:54 +11:00
Martin Schwenke
4adc8f4f09 ctdbd: Default for event_script_dir should use CTDB_BASE
Also get rid of ctdb_set_event_script_dir().  It creates an
unnecessary copy of something that will be around for the lifetime of
the process.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 21b4d1aba00902f1eee0cbf4f082b0794fd5b738)
2013-10-22 15:37:54 +11:00
Martin Schwenke
f9ce563135 ctdbd: Add nodes_file member to struct ctdb_context
This allows ctdb_load_nodes_file() to move to ctdb_server.c and
ctdb_set_nlist() to become static.

Setting ctdb->nodes_file needs to be done early, before the nodes file
is loaded.  It is now set from CTDB_BASE instead ETCDIR, so setting
CTDB_BASE also needs to be done earlier.

Unhack ctdbd_test.c - it no longer needs to define
ctdb_load_nodes_file().

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 20e705e63bd3b20837cc3ac92fdcf2a9650ccfc8)
2013-10-22 15:37:54 +11:00
Amitay Isaacs
c5ec04f24e client: Reimplement persistent transaction code using TRANS3_COMMIT
Implementing persistent trasnaction code from Samba.

Persistent transaction code was reimplemented in Samba using g_lock.tdb
to hold transaction locks and using TRANS3_COMMIT control.

Implementation details:

1. When starting a transaction, create a record with "transaction-<dbid>"
   as key and store current server_id in the structure.

2. If a record already exists, some other client has already started a
   transaction.  Verify that the process corresponding to server_id stored
   in the record really exists or it's a stale record and overwrite it.

3. All modifications to the actual persistent database are stored in a
   marshal buffer.

4. When transaction is committed, read the sequence number of the
   persistent database and increment it.  Sequence number record is also
   stored in the marshal buffer.

5. Send the changed records (marshal buffer) in TRANS3_COMMIT control
   to all the active nodes.

6. If all controls succeed, verify that the sequence number has been
   incremented.  Commit is successful.  If any of the controls fail,
   abort the transaction.

7. In case sequence number has not yet been incremented, then database
   recovery has been triggered.  So repeat from step 5.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 4e0f1971792c9431d8d51dc57d54ecc9e4576dd5)
2013-10-04 15:46:15 +10:00
Amitay Isaacs
be33efa3e4 ctdbd: Remove transaction code related to TRANS2 commits
This removes data types and structure elements related to TRANS2
persistent transaction code.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 22a253b7ccf1ff854cddf0b67969dc84d7d6a654)
2013-10-04 15:20:25 +10:00
Amitay Isaacs
91d644325d ctdbd: Deprecate TRANS2 commit controls
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 7d176352986317e63696d74252ff5d8eccb2fee5)
2013-10-04 15:20:25 +10:00
Amitay Isaacs
fe62936bb6 include: Remove unused set_dmaster structure
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 2ce3a48cc969d563c26dd295723416c0d7b077a2)
2013-10-04 15:20:25 +10:00
Amitay Isaacs
4ca9b96114 client: Add ctdb_ctrl_getdbseqnum() function
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 8cb1fbbfe88327c9c7ab68e8eded586dff611e57)
2013-10-04 15:15:34 +10:00
Amitay Isaacs
5d47f28e15 client: Add ctdb_ctrl_getdbstatistics() function
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 1e7fca5cdc1d7205cf084e35aace1a5dc46ea294)
2013-10-04 15:15:34 +10:00
Amitay Isaacs
105afa543e client: Add ctdb_client_check_message_handlers() function
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit c9a9d14c91f203ce964a426a8a1e2c1715af2098)
2013-10-04 15:15:34 +10:00
Martin Schwenke
b33ee7a2a4 recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE
The current implementation has a few flaws:

* A takeover run is called unconditionally when the timer goes even if
  the recovery master role has moved.  This means a node other than
  the recovery master can incorrectly do a takeover run.

* The rebalancing target nodes are cleared in the setup for a takeover
  run, regardless of whether the takeover run succeeds.

* The timer to force a rebalance isn't cleared if another takeover run
  occurs before the deadline.  Any forced rebalancing will happen in
  the first takeover run and when the timer expires some time later
  then an unnecessary takeover run will occur.

* If the recovery master role moves then the rebalancing data will
  stay on the original node and affect the next takeover run to occur
  if the recovery master role should come back to the original node.

Instead, store an array of rebalance target nodes in the recovery
master context.  This is passed as an extra argument to
ctdb_takeover_run() each time it is called and is cleared when a
takeover run succeeds.  The timer hangs off the array of rebalance
target nodes, which is cleared if the node isn't the recovery master.

This means that it is possible to lose rebalance data if the recovery
master role moves.  However, that's a difficult problem to solve.  The
best way of approaching it is probably to try to stop the recovery
master role from jumping around unnecesarily when inactive nodes join
the cluster.

The long term solution is to avoid this nonsense completely.  The IP
allocation algorithm needs to cache state between runs so that it
knows which nodes have just become healthy.  This also needs recovery
master stability.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c51c1efe5fc7fa668597f2acd435dee16e410fc9)
2013-09-19 12:54:31 +10:00
Martin Schwenke
1793412de2 recoverd: Remove unused CTDB_SRVID_RELOAD_ALL_IPS and handler
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 4cd727439a0824ebb8dbcf737d9888ffc3c41184)
2013-09-19 12:54:31 +10:00
Martin Schwenke
5f0913d321 recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS
This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK.  It stops
the IP checks but also causes any attempted takeover runs to fail and
be rescheduled.

This is meant to completely stop IP movements.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56)
2013-09-19 12:54:31 +10:00
Martin Schwenke
4c3f8dc3bb recoverd: Make the SRVID request structure generic
No need for a separate one for each SRVID.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d9c22b04d5aa7938a3965bd3144568664eb772ce)
2013-09-19 12:54:30 +10:00
Martin Schwenke
fe7f66547b client: Remove unused function list_of_active_nodes_except_pnn()
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit d8a76cf79f07dfb5a93c6c9a13f16e3268c7dd57)
2013-09-11 15:35:03 +10:00
Martin Schwenke
1ae731198a recoverd: Move struct ctdb_public_ip_list back into ctdb_takeover.c
This is an internal structure.  It was moved into ctdb_private.h a
long time ago to allow unit testing.  Unit test compilation was
changed shortly afterwards to make this unnecessary.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit db57261d7dc264e161659a8c547f44fbd9e88eeb)
2013-08-22 17:00:20 +10:00
Amitay Isaacs
1467b666f2 Revert "LACOUNT: Add back lacount mechanism to defer migrating a fetched/read copy until after default of 20 consecutive requests from the same node"
This reverts commit 035c0d981bde8c0eee8b3f24ba8e2dc817e5b504.

This is a premature optimization.  Record can bounce between nodes
very quickly if it is a contended record.  There is no need to hold a
record on a node unnecessarily.  In case record contention becomes bad,
enabling sticky records on a database is a better idea.

Conflicts:
	include/ctdb_private.h
	server/ctdb_tunables.c

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit ac417b0003f0116f116834ad2ac51482d25cfa0d)
2013-08-22 14:08:52 +10:00
Amitay Isaacs
de6b97ce4f Revert "recoverd: Use correct tdb flags when creating missing databases"
This reverts commit 10a057d8e15c8c18e540598a940d3548c731b0b4.

This approach would not work when creating local databases since currently
there is no control to receive TDB flags for remote databases.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit ca61eb776ab862bd269e45ee0f9f96e7e1e0e001)
2013-08-14 14:15:33 +10:00
Amitay Isaacs
f15e1a28a7 recoverd: Use correct tdb flags when creating missing databases
When creating missing databases either locally or remotely, make sure
to use the correct tdb flags from other nodes.  Without this, volatile
databases can get attached without TDB_INCOMPATIBLE_HASH flag.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 10a057d8e15c8c18e540598a940d3548c731b0b4)
2013-08-01 11:08:25 +10:00
Amitay Isaacs
d8fc36781c ctdbd: Remove incomplete ctdb_db_statistics_wire structure
Instead of maintaining another structure, add an element as place holder for
marshall buffer of hot keys.  This avoids duplication of the structure.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit e73b2e12adc9db1dedb48d32bba3a8406a80f4cd)
2013-07-29 16:00:46 +10:00
Amitay Isaacs
854216236b Revert "ctdbd: Remove incomplete ctdb_db_statistics_wire structure"
The structure cannot be removed without adding support for marshalling keys
for hot records.

This reverts commit 26a4653df594d351ca0dc1bd5f5b2f5b0eb0a9a5.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 023ca2e84f5ed064a288526b9c2bc7e06674dd81)
2013-07-29 16:00:46 +10:00
Amitay Isaacs
500b26e48f common/system: Add ctdb_set_process_name() function
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit fc3689c977f48d7988eed0654fb8e5ce4b8bfc8b)
2013-07-10 14:33:19 +10:00
Amitay Isaacs
d46c24f4d0 ctdbd: No need for DeadlockTimeout tunable
The code for deadlock detection and killing smbd process causing deadlock
has been removed and replaced with external debug script.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 2211cd94bea266547d3e6f167d3160a6b23bec88)
2013-07-10 14:33:18 +10:00
Amitay Isaacs
d36aa928fd ctdbd: Remove incomplete ctdb_db_statistics_wire structure
Send the ctdb_db_statistics directly instead of first copying it to
duplicate ctdb_db_statistics_wire structure.  This simplifies the
implementation of the control to get database statistics.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 26a4653df594d351ca0dc1bd5f5b2f5b0eb0a9a5)
2013-07-10 14:33:18 +10:00
Martin Schwenke
7290798a41 recoverd: Clean up log messages in remote IP verification
The log messages in verify_remote_ip_allocation() are confusing
because they don't include the PNN of the problem node, because it is
not known in this function.

Add the PNN of the node being verified as a function argument and then
shuffle the log messages around to make them clearer.

Also fold 3 nested if statements into just one.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit f0942fa01cd422133fc9398f56b4855397d7bc86)
2013-07-05 15:52:33 +10:00
Martin Schwenke
dbd1759eae util: New function ctdb_die()
This is like ctdb_fatal() but exits cleanly without dumping core or
generating a backtrace.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c0a9456692c88a7a5542cd893d8f326524d3f94e)
2013-07-05 15:52:33 +10:00
Amitay Isaacs
c6914e3891 banning: Make ctdb_local_node_got_banned() a void function
When this function is called, we are already committed to banning
and there is no point in failing this function.  In case, freezing of
databases fails, it will be fixed from recovery daemon.

(This used to be ctdb commit bb178338658b4ae32382a1f62f7c21cee1d4878f)
2013-07-02 12:59:08 +10:00
Amitay Isaacs
622ccd09f9 freeze: Make ctdb_start_freeze() a void function
If this function fails due to memory errors, there is no way to recover.
The best course of action is to abort.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 46efe7a886f8c4c56f19536adc98a73c22db906a)
2013-07-02 12:59:08 +10:00
Martin Schwenke
6a52a87028 ctdbd: Refactor shutdown sequence
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit b32fd04bfbf33062d45365b37a7247e272a76ceb)
2013-06-22 15:51:02 +10:00
Martin Schwenke
6d9667f01c ctdbd: Add new runstate CTDB_RUNSTATE_FIRST_RECOVERY
This adds more serialisation to the startup, ensuring that the
"startup" event runs after everything to do with the first recovery
(including the "recovered" event).

Given that it now takes longer to get to the "startup" state, the
initscript needs to wait until ctdbd gets to "first_recovery".

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit ed6814ff0a59ddbb1c1b3128b505380f60d7aeb7)
2013-05-24 14:08:07 +10:00
Martin Schwenke
77671b9ef5 ctdbd: New control CTDB_CONTROL_GET_RUNSTATE
Also new client function ctdb_ctrl_get_runstate().

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit dc4220e6f618cc688b3ca8e52bcb3eec6cb55bb1)
2013-05-24 14:08:07 +10:00
Martin Schwenke
63577c96db ctdbd: Replace ctdb->done_startup with ctdb->runstate
This allows states, including startup and shutdown states, to be
clearly tracked.  This doesn't include regular runtime "states", which
are handled by node flags.

Introduce new functions ctdb_set_runstate(), runstate_to_string() and
runstate_from_string().

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 8076773a9924dcf8aff16f7d96b2b9ac383ecc28)
2013-05-24 14:08:06 +10:00
Amitay Isaacs
1ddc7b0d10 locking: Remove functions that are not used anymore
These functions were used in locking child process to do the locking.  With
locking helper, these are not required.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit c660f33c3eaa1b4a2c4e951c1982979e57374ed4)
2013-05-24 09:06:40 +10:00
Martin Schwenke
54e91df60d recoverd: Move IP flags into ctdb_takeover.c
These should never be seen outside the IP allocation code.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit e143abd16ccde2e0edfe103673d31a5fb06b6aef)
2013-05-09 12:55:42 +10:00