samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2024-12-27 03:21:53 +03:00

Author	SHA1	Message	Date
Martin Schwenke	bc4e62be85	Eventscripts - call ctdb_check_args() instead of doing hand checking Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit cc5bc1948dcbe8b8b25185260927b94a4b529174)	2011-08-30 09:33:47 +10:00
Martin Schwenke	7980a4cb44	Eventscripts - new function ctdb_check_args() Pass this "$@" to do common eventscript argument checking. For regular use putting this in 00.ctdb would be enough. However, for developer testing it can be useful to call this in other eventscripts. For example, 10.interfaces and 13.per_ip_routing currently check these by hand. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 36de7e7fd6dfeed61ef9977b8d5b568f90a9707b)	2011-08-30 09:33:47 +10:00
Martin Schwenke	63729fc35d	Eventscripts - ctdb_check_tcp_ports() bug fix. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e8d9c0b251c84d6fdf6ea7d972e5f7d1d0222f9b)	2011-08-30 09:33:47 +10:00
Martin Schwenke	194de8faf8	Eventscripts - fix debugging buglet in ctdb_check_tcp_ports_ctdb() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 61000e38d6016e58f67e292393756d0bd5262ae5)	2011-08-30 09:33:47 +10:00
Martin Schwenke	9257b57f2c	Eventscripts: New configuration variable CTDB_SERVICE_AUTOSTARTSTOP. Some of the current auto-start/stop logic is broken, particularly for Samba. Fixing it is non-trivial. If $CTDB_SERVICE_AUTOSTARTSTOP is "yes" then auto-start/stop services when told to newly manage or no longer manage them. This defaults to "yes". However, if using a canned configuration file that doesn't set $CTDB_SERVICE_AUTOSTARTSTOP then this stops the auto-start-stop logic from working. Therefore, this works around CQ S1026685 - on the system in question another daemon controls service auto-start/stop and CTDB just gets in the way. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit ef71b8290ae49117d7bcc7166598b77cb64cc8a0)	2011-08-30 09:33:47 +10:00
Martin Schwenke	54402cdff4	Eventscripts - in 60.nfs uniquify the share check directory list There are sites that have multiple entries for the same export. This optimises the share check in this case. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 1ccdae79b64b236fc27f4653606429d73c9c3595)	2011-08-30 09:33:47 +10:00
Ronnie Sahlberg	02ebd35398	Merge remote branch 'martins/eventscripts' (This used to be ctdb commit bb008c01989ebb173a3f095ebd2f90ab54f9da91)	2011-08-17 14:10:04 +10:00
Martin Schwenke	6e7dbf0543	Eventscripts - new default TCP port checker using "ctdb checktcpport" New function ctdb_check_tcp_ports_ctdb(). This should be fast... and is now the default checker. If it fails in an unexpected way we fall back to the nmap and netstat checkers. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a1e16a707ce204817531a61455000361f972080a)	2011-08-17 14:02:45 +10:00
Martin Schwenke	1374327f6e	Eventscripts - generalise TCP port checking plus new nmap-based checker Split the netstat-specific parts of ctdb_check_tcp_ports() into new function ctdb_check_tcp_ports_netstat(). Implement new ctdb_check_tcp_ports_nmap() function that uses "nmap -PS" to check if the desired ports are listening. ctdb_check_tcp_ports() now uses new configuration variable CTDB_TCP_PORT_CHECKERS to decide which port checkers to try. Default value is currently "nmap netstat". If nmap is not found then this will fall back to netstat - if logging is at debug level this will also fill the logs with messages saying the nmap checker failed. This indicates that either nmap should be installed or the default value of CTDB_TCP_PORT_CHECKERS should be changed (in a configuration file) to avoid trying to use nmap. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d9651175b40b9454e7d4e98291955fcf1445085e)	2011-08-17 12:12:20 +10:00
Martin Schwenke	62f654d3d2	Eventscripts - ctdb_check_tcp_ports() only prints netstat output if debugging Use the new debug function to conditionally print the netstat output. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 44c14aeeb11080980fe07c7396d06843a4870747)	2011-08-17 10:39:54 +10:00
Martin Schwenke	86792724a2	Eventscripts - weaken TCP port check message if CTDB has just been started. Sometimes smbd and other services can take a while to start, especially when there is a lot of activity after ctdbd has just started. The TCP port check can then pollute the logs with lots of "ERROR" messages and possibly extra debug. This creates a flag file when a service is started (but not restarted) and this flag is removed the first time that TCP port checks succeed for that service. When a port check fails and the flag file still exists, a less extreme "INFO" message is printed rather than the usual "ERROR" message. This means that until the node actually becomes healthy we see more friendly messages. The subtext is that we're hearing false positive reports "recreates" of CQ S1024874 (samba stopped responding on port 445) quite often when ctdbd is started. This reduces the chances of people reporting such false recreates... Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 571865eb6ef847857129d0b1e2ba5fa7254bfe8c)	2011-08-17 10:39:53 +10:00
Martin Schwenke	5c9fbb55ce	Eventscript functions: optimise ctdb_check_tcp_ports() and add debug. ctdb_check_tcp_ports() runs "netstat -a -t -n" in a loop for each port. There are 2 problems with this: * Netstat is run on each loop iteration when it need only be run once. * The -a option is used to list all connections but the function only cares about the listening ports. There may be many thousands of non-listening ports to grep through. This changes ctdb_check_tcp_ports() to run netstat with the -l option instead of the -a option. It also only runs netstat once before the main loop. When a port is found to not be listening the output of the netstat command is now dumped to help with debugging. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 830355a8b18c53cfcc3ad1e3009bbb1a7a681fa0)	2011-08-17 10:39:53 +10:00
Martin Schwenke	f0f9271301	Eventscripts: add a debug() function and call ctdb_set_current_debuglevel() The debug function passes its arguments to echo if $CTDB_CURRENT_DEBUGLEVEL is >= 4 (i.e. DEBUG). If no args are given then use stdin - this allows the function to be used with here documents. To ensure $CTDB_CURRENT_DEBUGLEVEL is set, ctdb_set_current_debuglevel() is called near the end of the functions file. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 6143483d9f87322578c00f12081e381f425226ca)	2011-08-17 10:39:35 +10:00
Ronnie Sahlberg	ce4555b7a6	Don't use too big a persistence timeout value (This used to be ctdb commit 82628e32c431d66b806399ffb9657c3a031f6428)	2011-08-17 10:00:06 +10:00
Martin Schwenke	3e1a0528b8	Eventscripts - conditionally inherit ctdbd debug level in each monitor event Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a7eebc06f81a7b0a3fba93759bcbdeabc8c2e86e)	2011-08-17 09:14:23 +10:00
Martin Schwenke	171bef3d68	Eventscripts - new function ctdb_set_current_debuglevel() This function ensures that CTDB_CURRENT_DEBUGLEVEL is set. It works like this: 1. If it is already set then do nothing, since it might have been set some other way. The recommended "other way" would be to add a file in rc.local.d/. 2. If it is not set then set it by sourcing /var/ctdb/eventscript_debuglevel. 3. If this file does not exist then create it using output from "ctdb getdebug". If the optional 1st argument is set to "create" then don't source an existing file but create a new one instead - this is useful for creating the file just once in each event run in, say, 00.ctdb. If there's a problem getting the debug level from ctdb then it is silently set to 0 - no use spamming logs if our debug code is broken... Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 93910921c8a25f2b029733cd938069ff7c7bdab7)	2011-08-17 09:00:46 +10:00
Martin Schwenke	430ca2f606	Eventscripts - ensure the statd update-trigger file always exists. See the comment in the code for details. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 8ee9856996a8ec738e9d3ea7f1561605da526b8c)	2011-08-16 13:28:40 +10:00
Martin Schwenke	1452b63d27	Eventscripts: remove "return 0" from 50.samba service_stop(). This potentially masks errors and was basically included by accident. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e7e4a1b4f31118027fd13a6223192f9957cf2e74)	2011-08-16 13:18:40 +10:00
Ronnie Sahlberg	81292ac0e6	Change the errors for 10.interface to clearly state ERROR: for error messages. Update the test system to catch the new error strings generated by this change (This used to be ctdb commit a2c30d88348da47d1a733a16e4c7d83c3becb6df)	2011-08-15 15:53:04 +10:00
Ronnie Sahlberg	1fb577f4b2	Merge remote branch 'martins/eventscript.10.interface' (This used to be ctdb commit 0d17daab38d4086f922a8006d4c545133adca191)	2011-08-15 15:27:50 +10:00
Ronnie Sahlberg	bc00292cfe	Merge remote branch 'martins/60_nfs_regression' (This used to be ctdb commit 845fb0ba24cf9118470c58fae7103ab8322ce079)	2011-08-15 15:22:20 +10:00
Martin Schwenke	c9d168bbe4	Eventscripts: 10.interfaces - make startup event actually mark interfaces up! The startup event intends to mark interfaces up. However, it doesn't actually do that because $INTERFACES is empty. This uses the function get_all_interfaces() to list the interfaces... and then mark them up. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit fc62bf0975c6059ee467285565d0dc3b4daaf238)	2011-08-12 16:34:34 +10:00
Martin Schwenke	5ab955a73d	Eventscripts: 10.interfaces - startup comment says assume all interfaces good. Interfaces are currently marked down. Mark them up instead, as per the comment... and discussion with Ronnie. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 35942841229cc72ce363a7236aec708f1a33136b)	2011-08-12 16:34:34 +10:00
Martin Schwenke	e7963d8a65	Eventscripts: 10.interfaces - new function get_all_interfaces(). Move existing interface listing code to new function in preparation for using it in startup event. While we're here change the "sort | uniq" into "sort -u" and save some complexity. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit cd1442531ad079b11c60f46ee9d34f5104bef219)	2011-08-12 16:34:34 +10:00
Martin Schwenke	9bdcdb76be	Eventscripts: 10.interface clean-ups - minor tweaks and new comments. * sed can read files, it doesn't need a file piped to it * use $() subshells instead of `` - they seem to quote better in dash * tweak the uniquifying code so that it is easier to read * add comments * remove some extraneous semicolons at ends of lines Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5f49537889a92c3cb68d9203912188bedf00ecd4)	2011-08-12 16:34:13 +10:00
Martin Schwenke	32fe247e37	Eventscripts: In 60.nfs don't restart NFS when restarting rpc.lockd. This effectively reverts 953dbfbddad656a64e30a6aca115cb1479d11573 and is a policy decision. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 380c9263eb37db5a250264316e250c2160908263)	2011-08-12 16:28:09 +10:00
Martin Schwenke	7c33fb1711	Eventscripts: 10.interface clean-ups - variable name fix-ups. Change most of the uppercase variable names to lowercase for consistency with other variables, readability and so they can be easily distinguished from environment/configuration variables. Change the name of 2 of the variables to add some clarity. Changes are as follows: INTERFACES -> all_interfaces, IFACES -> ctdb_interfaces, IFACE -> iface, I -> i, REALIFACE -> realiface. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 7b201c1087b1433cfbc95de76cb4205e484ccd6f)	2011-08-12 15:57:34 +10:00
Martin Schwenke	6fa27bdf18	Eventscripts: 10.interfaces clean-ups - push logic into monitor_interfaces(). The logic in the monitor event itself is very complex. Nearly all of it can go away by adding a single check of $CTDB_PARTIALLY_ONLINE_INTERFACES to the return logic of monitor_interfaces() and reversing the sense of the corresponding check. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit fa93177442c65c2a4eb2d5d5dba0a0da1c486969)	2011-08-12 15:00:03 +10:00
Martin Schwenke	00c4cc6d22	Eventscripts: 10.interfaces clean-up - use more descriptive variable names. The name of variable $ok gives no clue to its meaning/use so this changes that variable to be named $up_interfaces_found. The return logic relating to $ok and $fail is difficult to read, so these variables are given true/false values, allowing the return logic to be simplified. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3402930319d462eab5525410f6a676952e120182)	2011-08-12 14:49:27 +10:00
Martin Schwenke	bb5db84021	Eventscripts: 10.interfaces cleanup - new functions mark_up(), mark_down(). The same few lines of logic are used every time an interface is marked up or down. This encapsulates those few lines in 2 new functions. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit ab443c4d7d282f282792abc6a6ac224ab06abe30)	2011-08-12 14:43:15 +10:00
Martin Schwenke	1d71dd08e3	Eventscripts: change failure counts and behaviour for statd and nfsd. We reduce the number of failures before attempting a restart. However, after 6 failures we mark the cluster unhealthy and no longer try to restart. If the previous 2 attempts didn't work then there isn't any use in bogging the system down with an attempted restart on every monitor event. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f654739080b40b7ac1b7f998cacc689d3d4e3193)	2011-08-12 14:16:17 +10:00
Martin Schwenke	398116ff29	Eventscripts: clean up 60.nfs monitor event. This adds a helper function called nfs_check_rpc_service() and uses it to make the monitor event much more readable. An example of usage is as follows: nfs_check_rpc_service "mountd" \ -ge 10 "verbose restart:b unhealthy" \ -eq 5 "restart:b" The first argument to nfs_check_rpc_service() is the name of the RPC service to be checked. The RPC service corresponding to this command is checked for availability using the rpcinfo command. If the service is available then the function succeeds and subsequent arguments are ignored. If the rpcinfo check fails then a failure counter for that particular RPC service is incremented and subsequent arguments are processed in groups of 3: 1. An integer comparison operator supported by test. 2. An integer failure limit. 3. An action string. The value of the failure counter is checked using (1) and (2) above. The first check that succeeds has its action string processed - note that this explains the somewhat curious reverse ordering of checks. In the example above: * If the counter is >= 10 then a verbose message is printed describing the failure, the service is restarted in the background and the node is marked as unhealthy (via an "exit 1" from the function). * If the counter is == 5 then the service is restarted in the background. For more action options please see the code. This also changes the ctdb_check_rpc() function so that it no longer takes a program number to check. It now just takes a real RPC program name that rpcinfo can resolve via /etc/rpc. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9b66057964756a6245bafb436eb6106fb6a2866e)	2011-08-12 14:16:14 +10:00
Martin Schwenke	1971336200	Eventscripts: fix regression in 60.nfs export checking. Commit 35a60a63a9b5c7d98dde514ae552239506b691c9 introduced a regression, reported by "Jonathan Buzzard" <J.Buzzard@dundee.ac.uk>, as follows: Basically the use of sed in the following code snippet does not work for long exports where exportfs wraps the host or network onto the next line. exportfs | grep -v '^#' | grep '^/' | sed -e 's/[[:space:]][^[:space:]]$//' | ctdb_check_directories The result is that you get lots of blank lines being sent to ctdb_check_directories which causes the host to be marked as unhealthy and then thrashing sets in of the managed IP's making the whole cluster unusable. This tightens up the sed expression so that it is less likely to produce a spurious empty line. It also removes an unnecessary "grep -v". Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit bd39b91ad12fd05271a7fced0e6f9d8c4eba92e6)	2011-08-11 15:01:39 +10:00
Ronnie Sahlberg	f9e58b502f	Merge remote branch 'martins/eventscript.10.interface' (This used to be ctdb commit 84ac667af408816e5508719b9fdb7c5e25408640)	2011-08-11 14:15:22 +10:00
Ronnie Sahlberg	b77a78d809	Merge remote branch 'martins/eventscript_infrastructure' (This used to be ctdb commit 20864822372b6d574c545287002a429b273c4bcc)	2011-08-11 14:01:02 +10:00
Martin Schwenke	088620b026	Eventscripts: in 60.nfs move statd-notify code to service_reconfigure(). This means that it now occurs on every reconfigure event. As a result the ipreallocated event is removed. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c45a89418ba733ff91d48340d72bdb6d2ef80051)	2011-08-11 13:56:25 +10:00
Martin Schwenke	eef89f83b2	Eventscripts - 60.nfs should define service_reconfigure(). Not $service_reconfigure. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 642292d7ba7a95567964b4160c7ee31a4f8985d1)	2011-08-11 13:55:02 +10:00
Ronnie Sahlberg	53b956fee7	When starting and stopping ctdb through the init-script, make sure we first clear all public ips before we start the daemon, in case they are still hanging around since a previous kill -9, and also make sure we drop them after we have stopped the daemon when shutting down. CQ S1027550 (This used to be ctdb commit 8de5513b3ad89711da845c7588d35b32e2f2acb6)	2011-08-11 11:48:04 +10:00
Martin Schwenke	3a760b09ed	Eventscripts: improvements to ctdb_service_check_reconfigure(). * Make this function applicable to "ipreallocated" event too. * Monitor event should not always succeed just because we reconfigure. If the service was unhealthy before the reconfigure and we end the reconfigure with "exit 0" then we can cause the node's health status to flip-flop. To avoid this we return the status of the service from the previous monitor event. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 21dfcbbdccd906fcd6ab7bba81418ce565bf63aa)	2011-08-11 10:46:57 +10:00
Martin Schwenke	e66a1af9b3	Eventscripts: 50.samba - only start/stop nmbd if $CTDB_SERVICE_NMB set. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit defaec99df8c279d8e315d5010f9146e013afda2)	2011-08-11 10:46:57 +10:00
Martin Schwenke	8fb04d451e	Eventscripts: 50.samba needs null service_reconfigure() function. Samba doesn't need to do anything for configuration changes. It will notice configuration changes and reload automatically. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit de13350c17261032a7468c2cf4d2cf4a8d66a840)	2011-08-11 10:46:57 +10:00
Martin Schwenke	b01d99a8fa	Eventscripts: 40.vsftpd service_stop() no longer /dev/null's output. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f928c201b6d0e1cd3e5568ae65186e3cee7c4988)	2011-08-11 10:46:57 +10:00
Martin Schwenke	1ea3616dcc	Eventscripts: improvements to 41.httpd. * Reduce the failure counts so that restart attempts happen sooner. * Use service_start() and service_stop() for the restart. ctdb_service_start() resets the failure count, which isn't very useful in this context. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 01776b9f29af9ad5c8534649ece1bd100e450434)	2011-08-11 10:46:56 +10:00
Martin Schwenke	2a14f91722	Eventscript functions: new function ctdb_check_counter(). This should eventually be able to replace ctdb_check_counter_limit() and ctdb_check_counter_equal(), although it doesn't issue warnings like the former. It takes 4 optional arguments: 1. _msg - If "error" then over limit causes an error message and an exit 1. Anything else fails silently but the function returns 1. Default is "error". 2. _op - An integer operator supported by test (e.g. -eq, -ge, -gt). Default is -ge. 3. _limit - Limit for the counter to be used in comparison. Default is $service_fail_limit. 4. _service_name - Used to identify the counter. Default is $service_name. For example: ctdb_check_counter error -ge 5 foo will print a message and exit 1 if the counter for foo is >= 5, whereas ctdb_check_counter check -ge 5 foo will just return 1 if the counter for foo is >= 5, and ctdb_check_counter will print a message and exit 1 if the counter for $service_name is >= $service_fail_limit. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5b01b7233515669e995e037205796e265643b176)	2011-08-11 10:46:56 +10:00
Martin Schwenke	219c6fd55b	Eventscripts: remove unused remove_ip() function. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 881af7c1417962b9b3ade6565b3e8eb9f9df7a97)	2011-08-11 10:46:56 +10:00
Martin Schwenke	5c948528b5	Eventscripts: startstop_nfs stop no longer redirects output to /dev/null. When stopping (as opposed to restarting) it is useful to see this information. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a9ab1937239761dc32b143c9d225447bc6f090b4)	2011-08-11 10:46:56 +10:00
Martin Schwenke	caee6f1508	Eventscripts: fix typo in _ctdb_counter_common(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f57d1722b6aa082f3f826171acc57d7d796ea95c)	2011-08-11 10:46:56 +10:00
Martin Schwenke	ab693dbcc0	Eventscripts: improve log messages in ctdb_start_stop_service(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 6da7095192fb172a06b434cfb02f4bfa6221b343)	2011-08-11 10:46:56 +10:00
Martin Schwenke	1b956b2b0a	Eventscript functions: fix counter regression. d362be7d32079ac1390d67056ce107bfbca2c937 wasn't well thought out. Subsequent commits depend on ctdb_counter_init() taking an argument, so this makes those cases work. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 05a8fcfbac3da2b5843b31e0fe258255cc761190)	2011-08-11 10:46:56 +10:00
Martin Schwenke	217edfa1c8	Eventscript functions: ctdb_service_check_reconfigure() acts only on monitor. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit beabf506a5eb68fc50fdbf8772c1d2bb0f7951e3)	2011-08-11 10:46:56 +10:00
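
Commit f0f9271301 in the log above describes a debug() helper that echoes its arguments only when $CTDB_CURRENT_DEBUGLEVEL is >= 4 and reads stdin when called without arguments. The following is a minimal sketch of that behaviour written from the commit message alone, not the code from the ctdb functions file; the default of 0 for an unset level is an assumption taken from the description in commit 171bef3d68.

```shell
# Hypothetical reconstruction of debug() based only on the commit message;
# the real implementation in the ctdb "functions" file may differ.
debug ()
{
    # Treat an unset level as 0 (assumption), so nothing is printed.
    if [ "${CTDB_CURRENT_DEBUGLEVEL:-0}" -ge 4 ] ; then
        if [ $# -gt 0 ] ; then
            # Arguments given: pass them to echo.
            echo "$@"
        else
            # No arguments: copy stdin, which allows use with here documents.
            cat
        fi
    fi
}
```

With the level set to 4, "debug some message" prints the message, and "debug <<EOF ... EOF" prints the here document; at lower levels both forms print nothing.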


541 Commits