1
0
mirror of https://github.com/samba-team/samba.git synced 2025-01-03 01:18:10 +03:00
Commit Graph

53 Commits

Author SHA1 Message Date
Amitay Isaacs
622ccd09f9 freeze: Make ctdb_start_freeze() a void function
If this function fails due to memory errors, there is no way to recover.
The best course of action is to abort.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 46efe7a886f8c4c56f19536adc98a73c22db906a)
2013-07-02 12:59:08 +10:00
Amitay Isaacs
cf17247d31 freeze: If priority is invalid here, it's time to abort
ctdb_start_freeze() is called from ctdb_control_freeze() which fixes the
priority if it's 0 and return error if it's invalid.  Other callers of
ctdb_start_freeze() are internal to CTDB.  So if priority is invalid in
ctdb_start_freeze(), definitely something is seriously wrong.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 87716e8f504d659515d3dbcf93badbf106873bc8)
2013-07-02 12:59:08 +10:00
Amitay Isaacs
6fe0089bc0 freeze: Log message from ctdb_start_freeze() and ctdb_control_freeze()
This ensures that whenever databases are frozen either via sending
control or by calling ctdb_start_freeze(), the action is logged.
Since ctdb_control_freeze() calls ctdb_start_freeze(), move logging of
message in early return condition if databases are already frozen.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 478e24bceda3fedfba54ccb48faa115df726b819)
2013-07-02 12:57:03 +10:00
Mathieu Parent
d82b9ae410 build: Fix tdb.h path to enable building with system TDB library
(This used to be ctdb commit f8bf99de3a5f56be67aaa67ed836458b1cf73e86)
2013-06-14 16:45:27 +10:00
Amitay Isaacs
23f83d58a5 ctdb_freeze: Replace locking functions with locking API
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 01ee86d2aafbcda658ef6acc2bba6d6781ae4047)
2012-10-20 02:48:44 +11:00
Amitay Isaacs
c4236ec8fb ctdbd: Return explicit boolean values for function returning bool
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 3de2830ae68241ee95bcc14dc1bb896ff18d86ce)
2012-07-16 12:12:05 +10:00
Amitay Isaacs
7631830152 server: Replace BOOL datatype with bool, True/False with true/false
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 6e5cbe8fff71985e5a2fc16b7e9f2b868011ff5d)
2012-05-28 11:22:25 +10:00
Ronnie Sahlberg
a57eba2bb4 Track all child process so we never send a signal to an unrelated process (our child died and kernel wrapped the pid-space and reused the pid for a different process
Wrap all creation of child processes inside ctdb_fork() which is used to track all processes we have spawned.
Capture SIGCHLD to track also which child processes have terminated.

Wrap kill() inside ctdb_kill() and make sure that we never send a !0 signal to a child process pid that has already terminated (and might have been replaced with a

(This used to be ctdb commit f73a4b1495830bcdd094a93732a89dd53b3c2f78)
2012-05-03 14:03:26 +10:00
Amitay Isaacs
4392591555 Remove explicit include of lib/tevent/tevent.h.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 0681014ca5ed2a9b56f63fdace7f894beccf8a9a)
2012-04-13 17:28:14 +10:00
Ronnie Sahlberg
653c0901d7 This needs more testing first
Revert "ctdbd: call tdb_reopen_all() in freeze child."

This reverts commit 3d9828861c771a060923f3181fa8224e0122bffc.

(This used to be ctdb commit 55c3446c9ba82d24b1d7db92bc3611fd8027b7fb)
2011-03-21 14:25:53 +11:00
Rusty Russell
88c9dec166 ctdbd: call tdb_reopen_all() in freeze child.
In theory, the ctdbd parent shouldn't be holding any locks, but it's a good
idea to always call tdb_reopen_all() after a fork().

(This used to be ctdb commit 3d9828861c771a060923f3181fa8224e0122bffc)
2011-03-21 13:57:53 +11:00
Michael Adam
a81f740f3d When wiping a database, clear the delete_queue.
(This used to be ctdb commit 731a6011ce4a1301f86eacb039955745f2b5d866)
2011-03-14 13:35:46 +01:00
Ronnie Sahlberg
2e8aac6689 Merge commit 'rusty/ports-from-1.0.112' into foo
(This used to be ctdb commit 13e58d92f5f1723e850a82ae030d0ca57e89b1ee)
2010-08-19 13:17:56 +10:00
Ronnie Sahlberg
4c05f1900c Merge commit 'rusty/vacuum-fix-master'
(This used to be ctdb commit dc301b324d2c14a2425a965c076113c4fe97903e)
2010-08-19 13:16:35 +10:00
Rusty Russell
9fbb191b78 logging: give a unique logging name to each forked child.
This means we can distinguish which child is logging, esp. via syslog where we have no pid.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit 68b3761a0874429b90731741f0531f76dcfbb081)
2010-08-18 11:46:32 +09:30
Rusty Russell
af55c910a4 freeze: abort vacuuming when we're going to freeze.
There are some reports of freeze timeouts, and it looks like vacuuming might
be the culprit.  So we add code to tell them to abort when a freeze is
going on.

(This is based on the 1.0.112 branch version 517f05e42f, but far
 simpler since tdb is now robust against processes being killed during
 transaction commit)

CQ:S1018154 & S1018349
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

(This used to be ctdb commit f5d7dc679501e607c2c83a248a89d3cada9df146)
2010-08-18 10:54:28 +09:30
Rusty Russell
f93440c4b7 event: Update events to latest Samba version 0.9.8
In Samba this is now called "tevent", and while we use the backwards
compatibility wrappers they don't offer EVENT_FD_AUTOCLOSE: that is now
a separate tevent_fd_set_auto_close() function.

This is based on Samba version 7f29f817fa.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit 85e5e760cc91eb3157d3a88996ce474491646726)
2010-08-18 09:16:31 +09:30
Rusty Russell
70082cd669 ctdb_freeze: extend db priority hack to cover serverid.tdb deadlock.
We discovered that recent smbd locks the serverid tdb while
holding a lock on another tdb (locking.tdb):
  7: POSIX  ADVISORY  WRITE smbd-2224318 locking.tdb.0 10600 10600
  22: -> POSIX  ADVISORY  READ  smbd-2224318 serverid.tdb.0 26580 26580

The result is a deadlock against the ctdb_freeze code called for
recovery.  We extend the "notify" workaround to this case, too.

BZ:65158
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


(This used to be ctdb commit dfdaa446cf256854ff6d267dceeb86fbee8bb188)
2010-07-01 21:46:55 +10:00
Ronnie Sahlberg
a4daf81a7c Additional log messages when tdb databases can no longer be chainlocked or chainunlocked
BZ64688

(This used to be ctdb commit b977901a49a9fed45cc8a2fe880eb749f58278f6)
2010-06-08 12:21:20 +10:00
Stefan Metzmacher
94bc40307a server: Use tdb_check to verify persistent tdbs on startup
Depending on --max-persistent-check-errors we allow ctdb
to start with unhealthy persistent databases.

The default is 0 which means to reject a startup with
unhealthy dbs.

The health of the persistent databases is checked after each
recovery. Node monitoring and the "startup" is deferred
until all persistent databases are healthy.

Databases can become healthy automaticly by a completely
HEALTHY node joining the cluster. Or by an administrator
with "ctdb backupdb/restoredb" or "ctdb wipedb".

metze

(This used to be ctdb commit 15f133d5150ed1badb4fef7d644f10cd08a25cb5)
2009-12-16 08:06:10 +01:00
Stefan Metzmacher
9069d3a7fb server: move error handling to a 'fail' label in ctdb_control_transaction_commit()
metze

(This used to be ctdb commit d874463235fa299e83fe562291c688aca3b85cf3)
2009-12-16 08:03:56 +01:00
Ronnie Sahlberg
e17fa0fdee change the lock wait child handling to use a pipe isntead of a socketpair
remove a stray alarm(30) that caused databases to be unlocked after 30 seconds.

(This used to be ctdb commit 12b187f971d857353403393a9850503e0e558672)
2009-11-26 12:08:35 +11:00
Ronnie Sahlberg
e627fae600 if a lock wait child died/finished, we could have released the lockwait handle and set it to NULL before we call the destructors for releaseing the waiters.
The waiters reference the locakwait handle in order to remove itself from the li
nked list which caused a SEGV.

We dont actually need to remove ourselves from this list here since
if the parent freeze_handle holding the list is freed, then all waiters are rele
ased as well, and the only place we actually need to relink the waiter is in ctd
b_freeze_lock_handler, where we want to respond back to the clients and release
the waiters  but we still want to keep the freeze_handle hanging around.

(This used to be ctdb commit e01ab46bafad09a5e320d420734db129d35863bc)
2009-10-22 13:41:28 +11:00
Ronnie Sahlberg
4b7a208b16 allow a pre .95 version of a recovery master to freeze databases on a post .95 node by remapping priority numbers and log this to log.ctdb
(This used to be ctdb commit 343c005367789e108c0320e95d7a264535d68dd8)
2009-10-14 10:14:03 +11:00
Ronnie Sahlberg
3ac5a52969 Port Volkers deadlock avoidance patch to HEAD.
This patch ensures that we lock all non-notify related databases first and
then the notify databases to avoiud a deadlock where samba needs to lock records on two databases at once (and notify being the second database).

Newer versions of samba would instead use the set-db-prio control to set this explicitely on a database per database basis instead of relying on  hardcoded database names. This patch will be reverted in the future when all updated versions of samba has been pushed out.

(This used to be ctdb commit 70e7781df1f118a0e2632a9c634f3fd388fa6c8c)
2009-10-14 08:17:49 +11:00
Ronnie Sahlberg
122c423b82 add a new control for explicitely cancelling recovery transactions, i.e. the
transactions we start across all tdb databased during the recovery.

this allows us to properly clean up and delete these tdb transactions on a
recovery failure.

(This used to be ctdb commit b2ce8b900a7d00944c84e0574fea5b371064a06d)
2009-10-12 16:48:05 +11:00
Ronnie Sahlberg
73c0adb029 initial attempt at freezing databases in priority order
(This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2)
2009-10-12 12:08:39 +11:00
Ronnie Sahlberg
d4c98516a2 uptade the freeze/thaw commands to be able to send the requested database priority to freeze/thaw to the daemon.
this is encoded in the srvid field of the request header

(This used to be ctdb commit 0cb3d33caa42ed783e03bc825b181dde4cf63616)
2009-10-12 09:22:17 +11:00
Ronnie Sahlberg
6cf7d8e131 add a control to set a database priority. Let newly created databases default to priority 1.
database priorities will be used to control in which order databases are locked during recovery in.

(This used to be ctdb commit 67741c0ee01916d94cace8e9462ef02507e06078)
2009-10-10 14:26:09 +11:00
Ronnie Sahlberg
96340bd166 Revert "we only need to have transaction nesting disabled when we start the new transaction for the recovery"
This reverts commit bf8dae63d10498e6b6179bbacdd72f1ff0fc60be.

(This used to be ctdb commit 87292029cb444ffab130ff7dae47a629c2d15787)
2009-05-25 16:55:27 +10:00
Ronnie Sahlberg
270907faec Revert "set the TDB_NO_NESTING flag for the tdb before we start a transaction from within recovery"
This reverts commit 1b2029dbb055ff07367ebc1f307f5241320227b2.

(This used to be ctdb commit 9762a3408f10409b629637d237ec513a825a6059)
2009-05-25 16:55:02 +10:00
Ronnie Sahlberg
3a6ace330e we only need to have transaction nesting disabled when we start the new transaction for the recovery
(This used to be ctdb commit bf8dae63d10498e6b6179bbacdd72f1ff0fc60be)
2009-04-26 08:48:15 +10:00
Ronnie Sahlberg
d20bb2498d set the TDB_NO_NESTING flag for the tdb before we start a transaction from within recovery
(This used to be ctdb commit 1b2029dbb055ff07367ebc1f307f5241320227b2)
2009-04-26 08:42:54 +10:00
Ronnie Sahlberg
334db8ccba proper waitpid() fix.
remove all waitpid() calls and use the event system to trap sigchld

(This used to be ctdb commit 77458b2b6b51b2970c12b0e5b097088d3fb9d358)
2008-07-09 14:02:54 +10:00
Ronnie Sahlberg
522830dea8 Revert "waitpid() can block if it takes a long time before the child terminates"
This reverts commit bfba5c7249eff8a10a43b53c1b89dd44b625fd10.

revert the waitpid changes.   we need to waitpid for some childredn so should
refactor the approach completely

(This used to be ctdb commit 702ced6c2fe569c01fe96c60d0f35a7e61506a96)
2008-07-08 17:41:31 +10:00
Ronnie Sahlberg
d67de4a7d2 waitpid() can block if it takes a long time before the child terminates
so we should not call it from the main daemon.

1, set SIGCHLD to SIG_DFL to make sure we ignore this signal

2, get rid of all waitpid() calls

3, change reporting of event script status code from _exit()/waitpid()   to write()/read() one byte across the pipe.

(This used to be ctdb commit bfba5c7249eff8a10a43b53c1b89dd44b625fd10)
2008-07-08 03:48:11 +10:00
Ronnie Sahlberg
64e02585e7 If a transaction commit fails. Log this error and cancel all pending transactions to the
databases instead of calling ctdb_fatal()

(This used to be ctdb commit ff2985aaef999d180277db4cf644fee0ea79c14d)
2008-07-07 08:51:05 +10:00
Ronnie Sahlberg
f25fd04f73 in the destructor for the lock-wait child, make sure that we cancel any pending
transactions.

(This used to be ctdb commit 45b6ff64f6ddf037b810c4e5f8b9f04d71067b98)
2008-07-07 08:50:12 +10:00
Andrew Tridgell
60e5d83cb0 fixed some incorrect CTDB_NO_MEMORY*() calls found after fixing the
_VOID varient

(This used to be ctdb commit 07c9133aedecaee3607ad3b6fa94e5c56417a9de)
2008-07-04 17:04:26 +10:00
Andrew Tridgell
07e145316c zero out the ctdb->freeze_handle when we free it
This prevents heap corruption when a freeze child dies

(This used to be ctdb commit 4edc6d40cb63936146af99030b7819683238abfc)
2008-07-04 16:05:04 +10:00
Ronnie Sahlberg
1ccc4a8e2b test
(This used to be ctdb commit 4f2d722cf29175c3c207e6ebb6d4f9e370767249)
2008-06-26 14:14:37 +10:00
Ronnie Sahlberg
f1b3ddc357 Revert "test"
This reverts commit f71287a28d66db202fe52f9a43b6daf2389d7f66.

(This used to be ctdb commit a928857e38d645baca62cea7f7367488d140dca7)
2008-06-26 14:00:36 +10:00
Ronnie Sahlberg
2cffc2e9c6 test
(This used to be ctdb commit f71287a28d66db202fe52f9a43b6daf2389d7f66)
2008-06-26 13:51:18 +10:00
Ronnie Sahlberg
cfc0af79ce third attempt for fixing a freeze child writing to the socket
(This used to be ctdb commit b8c8c5cb351747863c5d1366b57c96122ade5db0)
2008-06-26 11:52:26 +10:00
Ronnie Sahlberg
2910ea1606 only loop over the write it the write failed
(This used to be ctdb commit b99d687894cb69d863345713055d9c8dc1b29194)
2008-06-26 11:02:08 +10:00
Ronnie Sahlberg
77ef05e95b the write() from the freeze child process can fail
try writing many times and log an error if the write failed

(This used to be ctdb commit f15b224e42e81cda84b98f01f919d463e80fb89f)
2008-06-26 09:54:27 +10:00
Andrew Tridgell
f6e53f433b merge from ronnie
(This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c)
2008-02-04 20:07:15 +11:00
Andrew Tridgell
748843a3c6 added paranoid transaction ids
(This used to be ctdb commit afc1da53873cdbd31fcc8c6b22fae262e344cf6e)
2008-01-06 13:24:55 +11:00
Andrew Tridgell
c08f2616cd new simpler and much faster recovery code based on tdb transactions
(This used to be ctdb commit 9ef2268a1674b01f60c58fed72af8ac982fe77a3)
2008-01-06 12:38:01 +11:00
Andrew Tridgell
023a230d9c a useful hack for checking correct behaviour of recovery
(This used to be ctdb commit d88b95a5407b53ead47ca0638ee60653ea3d3d07)
2008-01-05 09:36:21 +11:00