samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2024-12-22 13:34:15 +03:00

Author	SHA1	Message	Date
Rusty Russell	6960fa96eb	eventscript: cleanup finished to take state arg We only need ctdb->current_monitor so we can kill it when we want to run something else; we don't need to use it here as we always know what script we are running. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 4cf1b7c32bcf7e4b65aec1fa7ee1a4b162cac889)	2009-12-08 12:24:56 +10:30
Rusty Russell	e548a335bd	eventscript: use wire format internally for script status. The only difference between the exposed and internal structure now is that the name and output fields were pointers. Switch to using ctdb_scripts_wire/ctdb_script_wire internally as well so marshalling is a noop. We now reject scripts which are too long and truncate logging to the 511 characters we have space for (the entire output will be in the normal ctdbd log). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit fd2f04554e604bc421806be96b987e601473a9b8)	2009-12-08 12:48:17 +10:30
Rusty Russell	9753b7e793	eventscript: rename ctdb_monitoring_wire to ctdb_scripts_wire We're going to allow fetching status of all script runs, so this name is no longer appropriate. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit f5cb41ecf3fa986b8af243e8546eb3b985cd902a)	2009-12-08 00:51:24 +10:30
Rusty Russell	3ff8bf8138	eventscript: get_current_script() helper This neatens the code slightly. We also use the name 'current' in ctdb_event_script_handler() for uniformity. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit e9661b383e0c50b9e3d114b7434dfe601aff5744)	2009-12-08 12:47:24 +10:30
Rusty Russell	cc678d572f	eventscript: use an array rather than a linked list of scripts This brings us closer to the wire format, by using a simple array and a 'current' iterator. The downside is that a 'struct ctdb_script' is no longer a talloc object: the state must be passed to our log fn, and the current script extracted with &state->scripts->scripts[state->current]. The wackiness of marshalling is simplified, and as a bonus, we can distinguish between an empty event directory (state->scripts->num_scripts == 0) and an error (state->scripts == NULL). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 76e8bdc11b953398ce8850de57aa51f30cb46bff)	2009-12-08 12:47:05 +10:30
Rusty Russell	1eda08ea29	eventscript: record script status for all events This unifies almost everything: the state->current pointer points to the struct ctdb_script where we record start, finish, status and output. We still only marshall up the monitor events; the rest disappear when the state structure is freed. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit c476c81f3e3d8fc62f2e53d82fce5774044ee9ce)	2009-12-08 12:46:18 +10:30
Rusty Russell	9b50f7ee67	eventscript: use scripts array directly, rather than separate list We rename ctdb_monitor_script_status to ctdb_script, and instead of allocating them as the scripts are executed, we allocate them up front and keep a "current" iterator. This slightly simplifies the code, though it means we only marshall up to the last successfully run script. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit b2a300768536d10bd867a987ad4cf1c5268c44bc)	2009-12-08 12:45:17 +10:30
Rusty Russell	23e24c503c	eventscript: ctdb_fork_with_logging() A new helper function which sets up an event attached to the child's stdout/stderr which gets routed to the logging callback after being placed in the normal logs. This is a generalization of the previous code which was hardcoded to call ctdb_log_event_script_output. The only subtlety is that we hang the child fds off the output buffer; the destructor for that will flush, which means it has to be destroyed before the output buffer is. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 32cfdc3aec34272612f43a3588e4cabed9c85b68)	2009-12-08 12:44:30 +10:30
Rusty Russell	c309d22f9a	eventscript: remove unused ctbd_ctrl_event_script* The child no longer uses ctdb_ctrl_event_script_init or ctdb_ctrl_event_script_finished, and the others are redundant: it doesn't need to tell us it's starting a script when it only runs one. We move start and stop calls to the parent, and eliminate the RPC infrastructure altogether. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 391926a87a7af73840f10bb314c0a2f951a0854c)	2009-12-08 00:27:40 +10:30
Rusty Russell	69c30c6ba0	eventscript: refactor forking code into fork_child_for_script() We do the same thing in two places: fire off a child from the initial ctdb_event_script_callback_v() and also from the ctdb_event_script_handler() when it's done. Unify this logic into fork_child_for_script(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 814704a3286756d40c2a6c508c1c0b77fa711891)	2009-12-08 00:22:55 +10:30
Rusty Russell	dd53eee7a2	eventscript: fork() a child for each script. We rename child_run_scripts() to child_run_script(), because it now runs a single script rather than walking the list. When it's finished, we fork the next child from the ctdb_event_script_handler() callback. ctdb_control_event_script_init() and ctdb_control_event_script_finished() are now called directly by the parent process; the child still calls ctdb_ctrl_event_script_start() and ctdb_ctrl_event_script_stop() before and after the script. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 0fafdcb8d3532a05846abaa5805b2e2f3cee8f47)	2009-12-08 00:21:25 +10:30
Rusty Russell	640b22ff61	eventscript: store from_user and script_list inside state structure This means all the state about running the scripts is in that structure, which helps in the next patch. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 020fd21e0905e7f11400f6537988645987f2bb32)	2009-12-08 00:15:18 +10:30
Rusty Russell	b8e347ec9c	eventscript: use direct script state pointer for current monitor We put a "scripts" member in ctdb_event_script_state, rather than using a special struct for monitor events. This will fit better as we further unify the different events, and holds the reports from the child process running each monitor script. Rather than making the monitor state a child of current_monitor_status_ctx, we just point current_monitor directly at it. This means we need to reset that pointer in the destructor for ctdb_event_script_state. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 9a2b4f6b17e54685f878d75bad27aa5090b4571f)	2009-12-08 00:14:01 +10:30
Rusty Russell	a4c2a98ba9	eventscript: make current_monitor_status_ctx serve as monitor_event_script_ctx We have monitor_event_script_ctx and other_event_script_ctx, and current_monitor_status_ctx in struct ctdb_context. This seems more complex than it needs to be. We use a single "event_script_ctx" as parent for all event script state structures. Then we explicitly reparent monitor events under current_monitor_status_ctx: this is freed every script invocation to kill off any running scripts anyway. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 0d925e6f2767691fa561f15bbb857a2aec531143)	2009-12-08 00:09:20 +10:30
Rusty Russell	68e224d9a4	eventscript: split ctdb_run_event_script into multiple parts Simple refactoring in preparation for switching to one-child-per-script. We also call the functions run by the child process "child_". Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit bfee777faff75e9bed4aedc1558957483616a6d3)	2009-12-07 23:55:03 +10:30
Rusty Russell	9a0c171fa7	eventscript: hoist work out of child process, into parent This is the start of a move towards finer-grained reporting, with one child per script. Simple code motion to do sanity check and get the list of scripts before fork(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 816b9177f51ae5b21b92ff4a404f548fe9723c96)	2009-12-07 23:53:35 +10:30
Rusty Russell	928b8dcb31	eventscript: handle banning within the callbacks Currently the timeout handler in eventscript.c does the banning if a timeout happens. However, because monitor events are different, it has to special case them. As we call the callback anyway in this case, we should make that handle -ETIME as it sees fit: for everyone but the monitor event, we simply ban ourselves. The more complicated monitor event banning logic is now in ctdb_monitor.c where it belongs. Note: I wrapped the other bans in "if (status == -ETIME)", though they should probably ban themselves on any error. This change should be a noop. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 9ecee127e19a9e7cae114a66f3514ee7a75276c5)	2009-12-07 23:48:57 +10:30
Rusty Russell	5190932507	eventscript: expose ctdb_ban_self() eventscript.c uses this now, but our next patch makes others use it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit a305cb7743c24386e464f6b2efab7e2108bb1e7e)	2009-12-07 23:18:40 +10:30
Rusty Russell	0dd46797d6	eventscript: handle v. unlikely timeout race If we time out just as the child exits, we currently will report an uninitialized cb_status field. Set it to -ETIME as expected. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 024386931bda9757079f206238ae09bae4de6ea2)	2009-12-07 23:17:23 +10:30
Rusty Russell	d5d88ecaaf	eventscript: replace other -1 returns with -errno This completes our "problem with script" reporting; we never set cb_status to -1 on error. Real errnos are used where the failure is a system call (eg. read, setpgid), otherwise -EIO is used if we couldn't communicate with the parent. The latter case is a bit useless, since the parent probably won't see the error anyway, but it's neater. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 1269458547795c90d544371332ba1de68df29548)	2009-12-07 23:15:56 +10:30
Rusty Russell	672e06f438	eventscript: simplify ctdb_run_event_script loop If we break, we avoid cut & paste code inside the loop. Need to initialize ret to 0 for the "no scripts" case. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit ec36ced9446da7e3bf866466d265ee8e18f606c1)	2009-12-07 23:13:12 +10:30
Rusty Russell	c70afe0cd4	eventscript: handle and report generic stat/execution errors Rather than ignoring deleted event scripts (or pretending that they were "OK"), and discarding other stat errors, we save the errno and turn it into a negative status. This gives us a bit more information if we can't execute a script (eg. too many symlinks or other weird errors). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 5d894e1ae5228df6bbe4fc305ccba19803fa3798)	2009-12-07 23:12:19 +10:30
Rusty Russell	b9b75bd065	eventscript: use -ENOEXEC for disabled status value This unifies code paths and simplifies things: we just hand -ENOEXEC to ctdb_ctrl_event_script_stop(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit eadf5e44ef97d7703a7d3bce0e7ea0f21cb11f14)	2009-12-07 23:11:47 +10:30
Rusty Russell	ce378014c7	eventscript: enhance script delete race check We currently assume 127 == script removed. The script can also return 127; best to re-check the execution status in this case (and for 126, which will happen if the script is non-executable). If the script is no longer executable/not present, we ignore it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 0a53d6b5ac81daf0efa32f35e7758ede2a5bdb63)	2009-12-07 23:09:02 +10:30
Rusty Russell	8993d6f523	eventscript: check_executable() to centralize stat/perm checks This is used later in the "script vanished" check. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 8ddb97040842375daf378cbb5816d0c2b031fa65)	2009-12-07 23:09:39 +10:30
Rusty Russell	066a791770	eventscript: use -ETIME for timeout status value This starts the move toward more expressive encoding of return values: positive values mean the script ran, negative means we had a problem with the script (and the value is the errno). This does timeout, but changes the ctdb tool to recognize it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 0eb1d0aa14e68b598d9e281c8a02b8f94a042fd9)	2009-12-07 23:09:42 +10:30
Rusty Russell	85a6f4a4dd	eventscript: marshall onto last_status immediately This simplifies the code a little: last_status is now ready to go (it's only used by the scriptstatus command at the moment). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 6be931266a4e41fd0253f760936ad9707dd97c47)	2009-12-07 23:09:40 +10:30
Rusty Russell	774bf144c1	eventscript: reduce code duplication for ending a script, and fix bug Commit 50c2caed57c0 removed a gratuitous talloc_steal from the code in ctdb_control_event_script_finished(), but not ctdb_event_script_timeout(). Easiest to call ctdb_control_event_script_finished() at the bottom of the timeout routine. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 17fa252d0d6981fbae8083a818f26d5ce9c5102e)	2009-12-02 16:15:57 +10:30
Ronnie Sahlberg	569001afd0	Merge commit 'martins/status-test-2' Conflicts: server/eventscript.c (This used to be ctdb commit e9b3477a5b9a2eff18f727e7d59338bfb5214793)	2009-12-01 10:53:18 +11:00
Ronnie Sahlberg	3bc643b46b	remove a stray ) so we compile (This used to be ctdb commit 16db4882635d84b8410a77e2ea8b08d0a257b0ab)	2009-11-27 13:35:39 +11:00
Ronnie Sahlberg	266a163c89	dont use talloc_steal() on a object that is already a child of ctdb. (This used to be ctdb commit 50c2caed57c0520f506eaaeeb0bba2c272da6ef6)	2009-11-27 13:28:31 +11:00
Ronnie Sahlberg	eaa6218def	Merge commit 'martins/status-test' into status-test-2 (This used to be ctdb commit 937823cc73eb098230acff4b1583f6d01f26c21a)	2009-11-27 12:50:45 +11:00
Martin Schwenke	dc2c8dfde1	Merge commit 'martins-svart/status-test-2' into status-test (This used to be ctdb commit 0e6c06ac38fd82adf124d111717502055501974a)	2009-11-27 12:49:31 +11:00
Martin Schwenke	ce06d3de46	Event script infrastructure: add reload event to check_options(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c278c798d41a35f58ca81f8f0e08e4dab51eba9b)	2009-11-27 12:04:02 +11:00
Ronnie Sahlberg	09b9bb2f9f	Merge commit 'martins/status-test' into status-test-2 (This used to be ctdb commit 28d0648725e7de4e4d0e8569e3fbfb0fa1d7f934)	2009-11-26 16:26:25 +11:00
Martin Schwenke	88cd194d6a	Merge commit 'martins-svart/status-test-2' into status-test (This used to be ctdb commit 143f1fa3cc4588505e3992c601153ea08be8432d)	2009-11-26 16:25:15 +11:00
Martin Schwenke	a64ccf07c1	Add flag to ctdb_event_script_callback indicating when called by client. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a1d654a982ca56fade82552f4e6b5586236d3233)	2009-11-26 15:49:49 +11:00
Ronnie Sahlberg	ed4f3ea3cc	resolve some conflicts from merging from martins branch (This used to be ctdb commit d3e7407dc9854ec358d081777c5450ec68b17862)	2009-11-26 13:42:12 +11:00
Martin Schwenke	8029db6a91	Merge commit 'martins-svart/status-test-2' into status-test Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a2830594ebeb54eb51ff90999cb12370aeec6e8b)	2009-11-26 10:49:47 +11:00
Rusty Russell	3188df4a88	eventscript: check that ctdb forced script events correct Now we're doing checking, we might as well make sure the commands from "ctdb eventscripts" are valid. This gets rid of the "UNKNOWN" event type. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 1d24a3869fe89fc9a109fd9e9b69df5fc665a5f6)	2009-11-25 11:02:29 +10:30
Rusty Russell	ff59bb34af	eventscript: check that ctdb forced script events correct Now we're doing checking, we might as well make sure the commands from "ctdb eventscripts" are valid. This gets rid of the "UNKNOWN" event type. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 66b22980b14601f29fe8cc64bd8f29883c7ca1c0)	2009-11-24 11:24:22 +10:30
Rusty Russell	0b4b83aea0	eventscript: check that internal script events are being invoked correctly This is not as good as a compile-time check, but at least we count the number of arguments are correct. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 83b7b233cb4707e826f6ba260bd630c8bc8f1e76)	2009-11-24 11:23:13 +10:30
Rusty Russell	187efa08ab	eventscript: check that internal script events are being invoked correctly This is not as good as a compile-time check, but at least we count the number of arguments are correct. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit a6d353519932eee48f9241ad8887b692882906c9)	2009-11-24 11:23:13 +10:30
Rusty Russell	534c709cba	eventscript: remove call name from state->options Finally, we remove the call name (eg. "monitor" or "start") from the options field of the struct: it now contains only extra options. This is clearer, and mainly involves adding some %s to debug statements. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 33fb0e7ba047ca73969b59bccf70a04a17c25a0a)	2009-11-24 11:22:46 +10:30
Rusty Russell	0ef91a4e1f	eventscript: remove call name from state->options Finally, we remove the call name (eg. "monitor" or "start") from the options field of the struct: it now contains only extra options. This is clearer, and mainly involves adding some %s to debug statements. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit b0648c7f08eba87ec3c9714e2525c9b621bfb4ef)	2009-11-24 11:22:46 +10:30
Rusty Russell	461f52736d	eventscript: put call type into state struct. This means we can get rid of more strcmp; they can simply use the state->call value instead. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 6c79fa33e26cc4f0873577f8e122b1495b4c427e)	2009-11-24 11:19:58 +10:30
Rusty Russell	205011cb61	eventscript: put call type into state struct. This means we can get rid of more strcmp; they can simply use the state->call value instead. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 834c93b3e1b8f4151b8a2cd82c2dd8bacc17f66c)	2009-11-24 11:19:58 +10:30
Rusty Russell	2d9254404d	eventscript: introduce enum for different event script calls. Rather than doing strcmp everywhere, pass an explicit enum around. This also subtly documents what options are available. The "options" arg is now used for extra arguments only. Unfortunately, gcc complains on empty format strings, so we make ctdb_event_script() take no varargs, and add ctdb_event_script_args(). We leave ctdb_event_script_callback() taking varargs, which means callers have to do "%s", "". For the moment, we have CTDB_EVENT_UNKNOWN for handling forced scripts from the ctdb tool. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 8001488be4f2beb25e943fe01b2afc2e8779930d)	2009-11-24 11:16:49 +10:30
Rusty Russell	e0c6e2f489	eventscript: introduce enum for different event script calls. Rather than doing strcmp everywhere, pass an explicit enum around. This also subtly documents what options are available. The "options" arg is now used for extra arguments only. Unfortunately, gcc complains on empty format strings, so we make ctdb_event_script() take no varargs, and add ctdb_event_script_args(). We leave ctdb_event_script_callback() taking varargs, which means callers have to do "%s", "". For the moment, we have CTDB_EVENT_UNKNOWN for handling forced scripts from the ctdb tool. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 470822b329f9d3ca9bef518b56e9ce28d5fedda2)	2009-11-24 11:16:49 +10:30
Rusty Russell	2763df22de	eventscript: put timeout inside ctdb_event_script_callback_v Everyone uses the same timeout value, so just remove it from the API. If we ever need variable timeouts, that might as well be central too. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 533c3e053293941d2a9484b495e78d45f478bb08)	2009-11-24 11:09:46 +10:30
Rusty Russell	5dee5769d3	eventscript: put timeout inside ctdb_event_script_callback_v Everyone uses the same timeout value, so just remove it from the API. If we ever need variable timeouts, that might as well be central too. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit fe8027309c1f7b987cd368fa98f9b28741baa786)	2009-11-24 11:09:46 +10:30
Rusty Russell	3845c6e5b8	eventscript: cleanup ctdb_event_script_v ctdb_event_script_v doesn't take varargs. ctdb_run_event_script is a better name, and fix comment. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 466beafadb37011fe273de8810ab0012e92a1fd8)	2009-11-24 11:09:01 +10:30
Rusty Russell	1d68bb35b2	eventscript: typo cleanups 1) ctdb_event_script_v doesn't take varargs. ctdb_run_event_script is a better name, and fix comment. 2) Fix indentation on allowed_scripts. 3) Comment on run_eventscripts_callback is wrong; it's the callback for any ctdb forced event. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit e7d57d7ae678b24dab3364a348838c6a3398942c)	2009-11-24 11:08:39 +10:30
Rusty Russell	ab675516cc	eventscript: fix bug in timeouts on forced eventscripts. Again. In 15bc66ae801b0c69, Ronnie fixed a double-free race. The problem was that ctdb_run_eventscripts() hands a context to ctdb_event_script_callback() to hang its data off, which gets freed in the callback. This particularly hurt in ctdb_event_script_timeout. There's nothing wrong with this, but obviously we should make the callback call last of all. At the time, ctdb_event_script_timeout() carefully extracted everything from the struct ctdb_event_script_state before calling ->callback. This was cleaned up in 64da4402c6ad485f (Ronnie again), and now state was referred to after the callback again. But the same change introduced a direct use-after-free bug which caused an occasional oops. So in our last episode (eda052101728cf92) Volker fixed this, and Michael committed it. But we still have the double free bug which 15bc66ae801b0c69 was supposed to fix! Let's try to fix this in a more permanent way, but always doing the callback from the destructor. This means we need to hold the status, and don't send the KILL signal if ->child is set to 0. Finally, add a comment about freeing ourselves in run_eventscripts_callback and the structure definition. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit b90bdb07c1f6913ddbf11bde9684bdc8af61c549)	2009-11-24 11:06:53 +10:30
Rusty Russell	0339a83897	eventscript: fix bug in timeouts on forced eventscripts. Again. In 15bc66ae801b0c69, Ronnie fixed a double-free race. The problem was that ctdb_run_eventscripts() hands a context to ctdb_event_script_callback() to hang its data off, which gets freed in the callback. This particularly hurt in ctdb_event_script_timeout. There's nothing wrong with this, but obviously we should make the callback call last of all. At the time, ctdb_event_script_timeout() carefully extracted everything from the struct ctdb_event_script_state before calling ->callback. This was cleaned up in 64da4402c6ad485f (Ronnie again), and now state was referred to after the callback again. But the same change introduced a direct use-after-free bug which caused an occasional oops. So in our last episode (eda052101728cf92) Volker fixed this, and Michael committed it. But we still have the double free bug which 15bc66ae801b0c69 was supposed to fix! Let's try to fix this in a more permanent way, but always doing the callback from the destructor. This means we need to hold the status, and don't send the KILL signal if ->child is set to 0. Finally, add a comment about freeing ourselves in run_eventscripts_callback and the structure definition. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 20b15de068d042b292725945927ceda1b01d07c0)	2009-11-24 11:06:53 +10:30
Rusty Russell	8723045c61	eventscript: clean up forked handler event code Write the whole int through the pipe, rather than quietly cutting it off. Also, use -2 as the result if the read fails; -1 comes from many paths if the child fails before running the script. Add a comment about why we don't need to check the write. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 6804f880436645b52c09a78fa300377fa8058d0e)	2009-11-24 11:00:13 +10:30
Ronnie Sahlberg	e6b69fa760	rework and simplify the eventscript handling This version has no trailing whitespace, and fixed (This used to be ctdb commit defbe318152fc479e8076ad70433cdb4971951af)	2009-11-25 11:00:11 +10:30
Rusty Russell	b320d434b2	eventscript: clean up forked handler event code Write the whole int through the pipe, rather than quietly cutting it off. Also, use -2 as the result if the read fails; -1 comes from many paths if the child fails before running the script. Add a comment about why we don't need to check the write. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit c715746c2f40eb9b21dbf011d16f1f1b0b53fdf9)	2009-11-24 11:00:13 +10:30
Ronnie Sahlberg	eb3b787394	rework and simplify the eventscript handling (This used to be ctdb commit c5f798116bf3b7954e23c7267b056ee1f5560f45)	2009-11-24 07:40:51 +11:00
Ronnie Sahlberg	ae209c74c8	dont reset the event script context everytime we start a new "ctdb eventscript ..." command. Use the existing context used for non-monitor events Multiple concurrent uses of "ctdb eventscript ..." could otherwise lead to a SEGV (This used to be ctdb commit 80a8d728e9680040e00d24361dfc9367dd372a56)	2009-11-19 11:03:51 +11:00
Ronnie Sahlberg	93d902e8f7	test of a change to make ctdbd use "status" event instead of the "monitor" event. This allows running the actual monitoring asynchronously from ctdbd and only using "status" to pick up the actual results. (This used to be ctdb commit 1908bac812650ca25151051f5d86815e0b8ed319)	2009-11-13 12:37:55 +11:00
Volker Lendecke	1fa1830f81	Fix a segfault in the eventscript timeout handler. The state was freed too early. Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit eda052101728cf922ce892e3c53b4f37e7ceac42)	2009-11-05 11:13:53 +01:00
Ronnie Sahlberg	d379b30182	create a separate context for non-monitor eventscripts so they dont collide (This used to be ctdb commit 325de818f88f339a16dc4544e899a2d735933c44)	2009-10-28 17:35:15 +11:00
Ronnie Sahlberg	f8a8c0d6e4	return 0 in the event script callback if it was aborted by a different script (This used to be ctdb commit 8d5cb2586a1d5a0255cc18295430927b914d4527)	2009-10-28 16:40:31 +11:00
Ronnie Sahlberg	e07ca41886	change the eventscript handling to allow EventScriptTimeout for each individual script instead of for the entire set of scripts; restructure the talloc hierarchy to allow this (This used to be ctdb commit 64da4402c6ad485f1d0a604878a7b0c01a0ea5f0)	2009-10-28 16:11:54 +11:00
Ronnie Sahlberg	3526bc830d	Enhance the logging from eventscripts. When a single script is finished, also log the name of the script, the duration it took and the return status. In the loop where we signal back to the main daemon that the script finished, do this once every 100ms instead of once every 1 second (This used to be ctdb commit 6a1f7a7b1b3a0b8f89998db8fdad83bbb4e9b5a5)	2009-10-28 09:07:43 +11:00
Ronnie Sahlberg	1d7681709b	dont run the monitor event so frequently after a event has failed. use _exit() instead of exit() when terminating an eventscript. (This used to be ctdb commit cc30ee2f4f33cb75b2be980c2d4dff6c7c23852f)	2009-10-27 13:51:45 +11:00
Ronnie Sahlberg	c61c655769	when scripts timeout, log pstree to a file in /tmp and just log the filename in the messages file (This used to be ctdb commit 0785afba8e5cd501b9e0ecb4a6a44edf43b57ab0)	2009-10-23 13:55:21 +11:00
Ronnie Sahlberg	902c476c03	From Volker L Fix some warnings and an incorrect check for a talloc failure (This used to be ctdb commit 27296a47b3d057a6729287acf128b2b67775ecde)	2009-10-22 12:19:40 +11:00
Ronnie Sahlberg	d5fd4fc0ce	During tests it is common to add/delete test eventscripts at runtime. This can race with the eventscript handling, which lists all scripts, sorts them, then executes them, so trap status code 127, which means the script could not be executed (or /bin/sh does not exist), and treat it as not causing the node to become unhealthy (This used to be ctdb commit befabc917edb036ca81f5216f65a6d62b26ee83e)	2009-10-21 16:50:39 +11:00
Ronnie Sahlberg	a92ba7f729	lower the debug levels for the "create FD messages" so we dont fill up the logs. (This used to be ctdb commit 87146db2769c2ec494813685bf9cec0d2a6336c3)	2009-10-21 15:26:24 +11:00
Ronnie Sahlberg	d788dd3627	From wolfgang Mueller Add a tuneable so that when scripts starts to hang/timeout, we can make the node unhealthy instead of banned (This used to be ctdb commit 2e9fc6f0609833c6d8146196011ef780669d615d)	2009-10-20 12:59:48 +11:00
Ronnie Sahlberg	9de3652380	add logging everytime we create a filedescriptor in the main ctdb daemon so we can spot if there are leaks. plug two leaks for filedescriptors related to when sending ARP fail and one leak when we can not parse the local address during tcp connection establish (This used to be ctdb commit ddd089810a14efe4be6e1ff3eccaa604e4913c9e)	2009-10-15 11:24:54 +11:00
Ronnie Sahlberg	c58a6b39a6	add more debugging output to eventscripts and when a script has timed out, print a full "pstree -p" to the log. Example : \|-ctdbd(29826)-+-ctdbd(29862) \| `-ctdbd(31897)-+-00.ctdb(31898)---sleep(31908) change the default timeout to 60 seconds for eventscripts (This used to be ctdb commit a3406c10d70f89d332eab25d481083142dff987d)	2009-10-14 14:14:28 +11:00
Ronnie Sahlberg	05137e4718	Fix bug spotted by Metze, the argument to ctdb_control_event_Script_disabled() is a string not a uint32 (This used to be ctdb commit 687535b51622d1fac7ccb38fa640bf1febd69fd8)	2009-10-09 22:22:11 +11:00
Ronnie Sahlberg	f8334e2f68	we should close this file on exec (This used to be ctdb commit c1c0ebb8da9a6c29ee83868a311f07f30cb4ed16)	2009-10-02 13:41:54 +10:00
Ronnie Sahlberg	e578bed20d	dont force an election just because the ban flag differs across the cluster. a simple push to resync this flag is sufficient (This used to be ctdb commit 8903b858ddd3a016d9cf765187839814443a67ca)	2009-09-09 10:57:39 +10:00
Ronnie Sahlberg	cda5f02c7c	new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a)	2009-09-04 02:20:39 +10:00
Ronnie Sahlberg	1cc79905ad	add new controls to make it possible to enable/disable individual eventscripts update scriptstatus output so it lists disabled scripts (This used to be ctdb commit 7e799b7523c9699bd65a8a8207f7e03d668b0b81)	2009-08-13 13:04:08 +10:00
Ronnie Sahlberg	e5e9fc48b1	create a new event : stopped. This event is called when a node is stopped and is used by eventscripts that need to do certain cleanup and removal of configuration or ip addresses or routing ... Note that a STOPPED node is considered "inactive" and as such will not be running the "recovered" event when the rest of the cluster has recovered. (This used to be ctdb commit 65e9309564611bf937ded3c74a79abff895d7c59)	2009-07-17 12:26:16 +10:00
Ronnie Sahlberg	5371e3a793	lower the loglevel when we log that we skip an eventscript because it is not executable (This used to be ctdb commit c265df3c7950aab51b8b6ef17040229b97345c35)	2009-06-01 15:29:36 +10:00
Sumit Bose	2fcedf6dac	add missing checks on so far ignored return values Most of these were found during a review by Jim Meyering <meyering@redhat.com> (This used to be ctdb commit 3aee5ee1deb4a19be3bd3a4ce3abbe09de763344)	2009-05-21 11:22:21 +10:00
Ronnie Sahlberg	a87e6f56ae	we only need to switch into client mode from the eventscript child if we are running the monitor event (This used to be ctdb commit 13e2c9044950f21918e4610726e73ed3d8f76920)	2009-04-06 14:03:09 +10:00
Ronnie Sahlberg	1f87ee85bc	use _exit() and not exit() when we terminate a failed eventscript child process (This used to be ctdb commit 33b296cee177adc61edc911caec8c24b3efa8441)	2009-04-06 13:16:36 +10:00
root	629d5ee1fa	add a new command "ctdb scriptstatus" this command shows which eventscripts were executed during the last monitoring cycle and the status from each eventscript. If an eventscript timed out or returned an error we also show the output from the eventscript. Example : [root@rcn1 ctdb-git]# ./bin/ctdb scriptstatus 6 scripts were executed last monitoring cycle 00.ctdb Status:OK Duration:0.021 Mon Mar 23 19:04:32 2009 10.interface Status:OK Duration:0.048 Mon Mar 23 19:04:32 2009 20.multipathd Status:OK Duration:0.011 Mon Mar 23 19:04:33 2009 40.vsftpd Status:OK Duration:0.011 Mon Mar 23 19:04:33 2009 41.httpd Status:OK Duration:0.011 Mon Mar 23 19:04:33 2009 50.samba Status:ERROR Duration:0.057 Mon Mar 23 19:04:33 2009 OUTPUT:ERROR: Samba tcp port 445 is not responding Add a new helper function "switch_from_server_to_client()" which can be used both by the recovery daemon and in the child process we start for running the actual eventscripts. Create several new controls, both for the eventscript child process to inform the master daemon of the current status of the scripts as well as for the ctdb tool to extract this information from the running daemon. (This used to be ctdb commit c98f90ad61c9b1e679116fbed948ddca4111968d)	2009-03-23 19:07:45 +11:00
Ronnie Sahlberg	b9bd20ce55	add a context and a timed event so that once we have been in recovery mode for too long we drop all public ip addresses (This used to be ctdb commit 403c68f96e1380dd07217c688de2730464f77ea0)	2008-10-22 11:04:41 +11:00
Ronnie Sahlberg	5808a7be96	allow multiple eventscripts using the same prefix. this eases the pain for users that use out of tree eventscripts (This used to be ctdb commit 8313dfb6fc5404cd2d065af6620412f8664ada11)	2008-10-16 17:57:50 +11:00
Ronnie Sahlberg	0964c59dc6	Do not allow "ctdb eventscript" to start new eventscripts while we are in recovery mode (This used to be ctdb commit 8140825e1d06053a900fd0adf0a150622c0fc146)	2008-07-17 09:04:15 +10:00
Ronnie Sahlberg	66222af5e4	Fix a very subtle race where we could get a double free of talloced memory if ctdb_run_eventscript() was called during processing of ctdb_event_script_timeout() for user-invoked eventscripts (eventscripts invoked by "ctdb eventscript ...") Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com> (This used to be ctdb commit 15bc66ae801b0c69a65a7a2acf5df151e76edc2a)	2008-07-11 10:33:46 +10:00
Ronnie Sahlberg	334db8ccba	proper waitpid() fix. remove all waitpid() calls and use the event system to trap sigchld (This used to be ctdb commit 77458b2b6b51b2970c12b0e5b097088d3fb9d358)	2008-07-09 14:02:54 +10:00
Ronnie Sahlberg	522830dea8	Revert "waitpid() can block if it takes a long time before the child terminates" This reverts commit bfba5c7249eff8a10a43b53c1b89dd44b625fd10. revert the waitpid changes. we need to waitpid for some children so should refactor the approach completely (This used to be ctdb commit 702ced6c2fe569c01fe96c60d0f35a7e61506a96)	2008-07-08 17:41:31 +10:00
Ronnie Sahlberg	d67de4a7d2	waitpid() can block if it takes a long time before the child terminates so we should not call it from the main daemon. 1, set SIGCHLD to SIG_DFL to make sure we ignore this signal 2, get rid of all waitpid() calls 3, change reporting of event script status code from _exit()/waitpid() to write()/read() one byte across the pipe. (This used to be ctdb commit bfba5c7249eff8a10a43b53c1b89dd44b625fd10)	2008-07-08 03:48:11 +10:00
Ronnie Sahlberg	6bfbec28a4	use more liberal handling of event scripts timing out. If the event script that timed out was for the "monitor" event, then even if it timed out we still return SUCCESS back to the caller invoking the eventscript. Only consider the eventscript for "monitor" to have failed with an error IFF it actually terminated with an error, or if it timed out 5 times in a row and hung. (This used to be ctdb commit 60f3c04bd8b20ecbe937ffed08875cdc6898b422)	2008-07-07 20:38:59 +10:00
Ronnie Sahlberg	779468ab3f	if the event scripts hangs EventScriptsBanCount consecutive times in a row the node will ban itself for the default recovery ban period (This used to be ctdb commit 7239d7ecd54037b11eddf47328a3129d281e7d4a)	2008-06-13 13:18:06 +10:00
Ronnie Sahlberg	30535c815d	when an eventscript has timed out, log the event options (i.e. "monitor" "takeip 1.2..." etc) to the log (This used to be ctdb commit dbe31581abf35fc4a32d3cbf487dd34e2b9c937a)	2008-06-13 12:18:00 +10:00
Andrew Tridgell	8ec3665231	put the return in the right place We were refusing the 'startrecovery' event (This used to be ctdb commit 788d38812d73729f11d12e9812b16092c0ae4123)	2008-05-14 22:05:09 +10:00
Andrew Tridgell	e465110f95	Fix the chicken and egg problem with ctdb/samba and a registry smb.conf This attempts to fix the problem of ctdb event scripts blocking due to attempted access to the ctdb databases during recovery. The changes are: - now only the 'shutdown' and 'startrecovery' events can be called with the databases locked in recovery. The event scripts must ensure that for these two events no database access is attempted - the recovered, takeip and releaseip events could previously be called inside a recovery. The code now ensures that this doesn't happen, delaying the events till after recovery has finished - the 50.samba event script now avoids using testparm unless it is really needed This needs extensive testing. (This used to be ctdb commit e3cdb8f2be6a44ec877efcd75c7297edb008a80b)	2008-05-14 20:57:04 +10:00
Ronnie Sahlberg	e8e67ef576	add a mechanism to force a node to run the eventscripts with arbitrary arguments ctdb eventscript "command argument argument ..." (This used to be ctdb commit 118a16e763d8332c6ce4d8b8e194775fb874c8c8)	2008-04-02 11:13:30 +11:00
Andrew Tridgell	f6e53f433b	merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c)	2008-02-04 20:07:15 +11:00
Andrew Tridgell	9d6ac0cf55	added debug constants to allow for better mapping to syslog levels (This used to be ctdb commit 7ba8f1dde318eab03f4257e5a89fd23e7281e502)	2008-02-04 17:44:24 +11:00
Ronnie Sahlberg	12ebb74838	change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumption that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitly lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8)	2007-09-04 09:50:07 +10:00
Ronnie Sahlberg	794fb10634	add an extra debug statement when we send a SIGTERM to a process (This used to be ctdb commit a9c1be9cf9efdc69bfc95657b70e9f8b8230cda8)	2007-08-27 17:33:46 +10:00
Ronnie Sahlberg	b582e13cae	make sure that the event script is executable and just ignore it otherwise (This used to be ctdb commit 65eb7845c70489d654acaaf99cd2c8eac7df11dc)	2007-08-21 09:22:14 +10:00
Andrew Tridgell	405e123ffb	removed redundant debug message (This used to be ctdb commit 9ee742b7cc43be7da6b568308912a3f2cfe4f4d3)	2007-08-20 11:13:38 +10:00
Andrew Tridgell	46639ac19e	merged new event script calling code from ronnie (This used to be ctdb commit bbacad61b3eee4276ffe44ed2a23949aca8152cf)	2007-08-20 11:10:30 +10:00
Ronnie Sahlberg	7322e82bcb	add text to the event script timeout log on how to find out which script timed out (This used to be ctdb commit bd6db995fb00ed45c5f0a50bbe6cf5d0fe22a194)	2007-08-15 15:08:42 +10:00
Ronnie Sahlberg	3b9d50f3ee	change the now rather small /etc/ctdb/events script into a service specific script /etc/ctdb/events.d/00.ctdb get rid of CTDB_EVENTS_SCRIPT and --event-script (This used to be ctdb commit 81ccfaf838e5772d4a58eb6a70224b7b39aba9f3)	2007-08-15 15:01:31 +10:00
Ronnie Sahlberg	ff58f7c7ea	add a comment that the talloc_free also removes the script from the tree (This used to be ctdb commit ce71f6e9cf983cc4fe66935ad6c18d55dfed03a5)	2007-08-15 14:46:06 +10:00
Ronnie Sahlberg	4023576e50	call the service specific event scripts directly from the forked child instead of from /etc/ctdb/events so that we can get better debugging output in the logs when something fails in the scripts (This used to be ctdb commit 4ed96b768aea1611e8002f7095d3c4d12ccf77a3)	2007-08-15 14:44:03 +10:00
Ronnie Sahlberg	5a02262a06	comment that ctdb_event_script_v() is called from a forked child's context and thus can make blocking calls (This used to be ctdb commit b31d98281f15995ad340d2510e08e04ed46e271a)	2007-08-15 10:48:10 +10:00
Andrew Tridgell	689195b455	make sure we still run events when waiting for ctdb_event_script() (This used to be ctdb commit 05efbfe9ff9691c1d7441e7b9855aed25791faf0)	2007-07-19 13:36:00 +10:00
Andrew Tridgell	d2a5af7eb8	fully save/restore scheduler parameters (This used to be ctdb commit 59408eabe7515d49a6eef3b6fb2590a1cd1df956)	2007-07-13 09:35:46 +10:00
Andrew Tridgell	fc73bc5c24	added --nosetsched option to ctdbd (This used to be ctdb commit 4cbbb88c1735c7d112e751e22da1c1c69e09bf4a)	2007-07-13 08:47:02 +10:00
Andrew Tridgell	32de198fd3	update lib/replace from samba4 (This used to be ctdb commit f0555484105668c01c21f56322992e752e831109)	2007-07-10 15:29:31 +10:00
Andrew Tridgell	006227e80a	forgot to add this (This used to be ctdb commit 30fc56b7489e42633532964096e53faee1319dde)	2007-07-04 17:45:46 +10:00

215 Commits