samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2025-03-22 02:50:28 +03:00

Author	SHA1	Message	Date
Martin Schwenke	45878d4363	eventscripts: New configuration variable $CTDB_NFS_DUMP_STUCK_THREADS If some nfsd threads are still alive after a shutdown during a restart then this indicates the maximum number of threads for which a stack trace should be dumped. This can be useful for trying to determine why nfsd is stuck. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 2503245db10d567af708a04edd3a3b488c24f401)	2013-06-14 15:15:06 +10:00
Martin Schwenke	f408caea2a	eventscripts: Add new option $CTDB_MONITOR_NFS_THREAD_COUNT Consider the following example: 1. There are 256 nfsd threads configured. 2. 200 threads are "stuck" in system calls, perhaps waiting for the underlying filesystem when an attempt is made to restart NFS. 3. 56 threads exit when NFS is stopped. 4. 56 new threads are started when NFS is started. 5. 200 "stuck" threads exit leaving only 56 threads running. Setting this option to "yes" makes the 60.nfs monitor event look for this situation and try to correct it. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 99b0d8b8ecc36dfc493775b9ebced54539c182d2)	2013-06-13 20:01:22 +10:00
Martin Schwenke	2e515f2306	eventscripts: Fix statd-callout update handling 60.nfs and 60.ganesha touch $statd_update_trigger every time they're run. This stops the statd-callout updates from ever being called. Make this logic self-contained and move it to new function nfs_statd_update() in the functions file. Call this in 60.nfs and 60.ganesha with the appropriate update period as the only argument. Signed-off-by: Martin Schwenke <martin@meltin.net> Reported-by: Poornima Gupte <poornima.gupte@in.ibm.com> (This used to be ctdb commit 1b5968f6be084590667f4f15ff3bef13ed9a2973)	2013-05-28 16:11:47 +10:00
Martin Schwenke	1eab9c898c	eventscripts: Stop NAT gateway's delete_all() from polluting the log Every time a node that wasn't the NAT gateway master gets reconfigured something like this appears in the log: ctdbd: 11.natgw: Failed to del 10.0.1.139 on dev eth1 Since this usually fails it is better to mute the error than to have it pollute the log. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0ca7a98ffef50cbd06849cfbf65fb4a3d668b7bd)	2013-05-27 15:15:25 +10:00
Martin Schwenke	66019e3287	scripts: Provide mktemp function for platforms without mktemp command This is needed for AIX and possibly others. Also, a cheaper mktemp function is needed in the run_tests script. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b2b572e9049c7138bd223226475bef8fe3e01f10)	2013-05-27 15:14:33 +10:00
Martin Schwenke	a989a299d1	eventscripts: 11.natgw should not call ctdb tool in "init" event The current code calls "ctdb setnatgwstate ..." on every event. However, calling the ctdb tool in the "init" event is not permitted. Instead, update the capability when it is needed and at regular intervals via the "monitor" event. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 39a43feae7c7de07ddaf2d6cb962f923d47d0c19)	2013-05-24 14:08:07 +10:00
Martin Schwenke	6d9667f01c	ctdbd: Add new runstate CTDB_RUNSTATE_FIRST_RECOVERY This adds more serialisation to the startup, ensuring that the "startup" event runs after everything to do with the first recovery (including the "recovered" event). Given that it now takes longer to get to the "startup" state, the initscript needs to wait until ctdbd gets to "first_recovery". Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit ed6814ff0a59ddbb1c1b3128b505380f60d7aeb7)	2013-05-24 14:08:07 +10:00
Martin Schwenke	b5ebff6931	tools/ctdb: "ctdb runstate" now accepts optional expected run state arguments If one or more run states are specified then "ctdb runstate" succeeds only if ctdbd is in one of those run states. At the moment, if the "setup" event fails then the initscript succeeds but ctdbd exits almost immediately. This behaviour isn't very friendly. The initscript now waits until ctdbd is in "startup" or "running" run state via the use of "ctdb runstate startup running", meaning that ctdbd has successfully passed the "setup" event. The "setup" event code in 00.ctdb now waits until ctdbd is in the "setup" run state before proceeding via the use of "ctdb runstate setup". Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 4a2effcc455be67ff4a779a59ca81ba584312cd6)	2013-05-24 14:08:07 +10:00
Martin Schwenke	bb39f0a186	scripts: Rework notify.sh to use notify.d/ directory This makes it easier to add notification handlers. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d29e9a420b133088bf23a847c8d1dbce56c25eb0)	2013-05-23 16:18:23 +10:00
Martin Schwenke	51dbaecb54	eventscripts: Fix regression in _loadconfig() fff88940f71058e4eefd65f50a6701389c005c17 introduced a regression. Without $service_name set by default, the CTDB configuration is no longer loaded when loadconfig() is called without any arguments. That's bad. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f1619a36c1beba11533052dc5728fa3adaa08870)	2013-05-22 14:24:21 +10:00
Martin Schwenke	ff9831f5b1	initscript: If CTDB doesn't become ready, print a message before killing Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e6b6b793f61556c21e8daf34abf89ee7b388ecfb)	2013-05-22 14:24:21 +10:00
Amitay Isaacs	84bcb95952	eventscripts: Do not use bashism for string comparison Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit b0cae7d5a00ef3764bae187affc8e9a252f4b329)	2013-05-20 19:47:10 +10:00
Martin Schwenke	de84c1fd3c	eventscripts: NFS RPC checks no longer support "knfsd" No longer used, support removed from test infrastructure. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0eb351ff4c7ee096de7c5e0a59561067091fa32e)	2013-05-07 12:55:09 +10:00
Martin Schwenke	434f9e8594	eventscripts: 60.nfs uses nfs_check_rpc_services() to check NFS RPC services * New directory nfs-rpc-checks.d/ replaces hardcoded rules in 60.nfs * Installation and packaging additions to handle nfs-rpc-checks.d/ * Unit test updates, including deleting 1 test that sanity checked test infrastructure * Test infrastructure changes to use nfs-rpc-checks.d/ Note that this removes support for $CTDB_NFS_SKIP_KNFSD_ALIVE_CHECK in 60.nfs. To get the equivalent behaviour, edit 20.nfsd.check and remove/comment all lines. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 7e792d6768d9ca420ce3713cb122e63afd594b15)	2013-05-07 12:55:09 +10:00
Martin Schwenke	05b2edeec2	eventscripts: NFS RPC checks allows "nfsd" in addition to "knfsd" Want nfs_check_rpc_services() to support filenames without the 'k'. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d9775fcbd6e30eef8382bea68e2f9bad2309f2c1)	2013-05-06 20:40:58 +10:00
Martin Schwenke	c52183c055	eventscripts: New function nfs_check_rpc_services() This is intended to replace nfs_check_rpc_service(), which builds configuration into eventscripts. nfs_check_rpc_services() uses a directory of configuration checks that can be edited by an administrator. The files have one limit check and a set of actions per line. The program name is extracted from the file name. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9bc8fbee6550ed2814fb35c70d57fab21ef1b8fd)	2013-05-06 20:40:58 +10:00
Martin Schwenke	167acd1cd5	eventscripts: nfs_check_rpc_action() should be _nfs_check_rpc_action() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5a717fd495ba5a2bfd481d69f38b68fa4576716f)	2013-05-06 20:40:58 +10:00
Martin Schwenke	bdab9d1ea6	eventscripts: Factor out common code from nfs_check_rpc_service() This creates new function _nfs_check_rpc_common(). Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit cc3bb42e48bbdabd19187c231846b98589b4f4f3)	2013-05-06 20:40:58 +10:00
Martin Schwenke	910e138cb3	eventscripts: Remove ganesha support from nfs_check_rpc_service() This is unused so doesn't need to be maintained. An attempt to use it now will explicitly fail rather than implicitly fail via bitrot. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 887733dd7be53158bfe07b30ef31b611d0f8122f)	2013-05-06 20:40:58 +10:00
Martin Schwenke	944d063a3e	Revert "Eventscript functions: add optional version to nfs_check_rpc_service()" This reverts commit 92f74fd589467b46c758e116e97417edfe8773d7. This change is unused and is just complicating the function. Conflicts: config/functions (This used to be ctdb commit 77302dbfd85754e02559eccb2dd6c090db0b6b9f)	2013-05-06 20:40:58 +10:00
Martin Schwenke	577a3cae5d	eventscripts: Move rpc.statd existence check into nfs_check_rpc_service () The code in 60.nfs is going to be genericised, so make all the checks look the same. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 15b0f78cbf8d6ba481b7eba9e4fe3f4270214c72)	2013-05-06 20:40:58 +10:00
Martin Schwenke	6c347a5294	eventscripts: Factor NFS RPC check action code into nfs_check_rpc_action() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 4b4e7d8f0e8dcbab987e374d06ffaa21c06da0d3)	2013-05-06 20:40:58 +10:00
Martin Schwenke	2bc807f974	eventscripts: Remove unused function ctdb_check_counter_limit() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a8ef00608e48a551a334aded206146807aeb4c5a)	2013-05-06 16:24:59 +10:00
Martin Schwenke	460d0651b6	eventscripts: Use ctdb_check_counter() instead of ctdb_check_counter_limit() ctdb_check_counter_limit() can soon be removed... Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit bb2cdff77e8ec79e7d319159b9c9848ecfaaa0f1)	2013-05-06 16:24:59 +10:00
Martin Schwenke	8373226251	eventscripts: Might as well try to stat the reclock file first It is in the background but it still might cause the counter to be reset before it is checked. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit ef2cf75e95ff382c65524a4d77eb00ab8411d2fc)	2013-05-06 16:24:58 +10:00
Martin Schwenke	31c3edcadf	eventscripts: Make the early exit in 01.reclock earlier That way we don't even check the counter... Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 136abd4604dc68f7c696704bac708bae53cf1940)	2013-05-06 16:24:58 +10:00
Martin Schwenke	29a3823e40	eventscripts: Minor cleanups for killtcp/tickle functions Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 25ef4f655f1efc833deb5e244f9fff461e92f439)	2013-05-06 16:24:50 +10:00
Martin Schwenke	189a5c003c	eventscripts: Tweak the timeout check in kill_tcp_connections() This has 2 advantages: 1. It uses get_tcp_connections_for_ip() to check for leftover connections, instead of custom code. 2. It checks for the timeout condition before sleeping. The current code sleeps and then checks, so wastes a second. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 60a08eb96e1d97aab31e9bd4af01683c650541c2)	2013-05-06 16:22:15 +10:00
Martin Schwenke	8f84a2bec7	eventscripts: In killtcp/tickle functions, $_failed should be boolean Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 319c1b68d5aa78f82a68febcad233a7c78afc887)	2013-05-06 16:22:07 +10:00
Martin Schwenke	ed59deaee3	eventscripts: Remove unused $_killcount from tickle_tcp_connections() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 8514ca56830b30e7f0eb5018632640daaf8ff65d)	2013-05-06 16:16:56 +10:00
Martin Schwenke	975ea7fb7a	eventscripts: Refactor connection listing in killtcp and tickle functions Uses new function get_tcp_connections_for_ip(). This avoids using a temporary file and running netstat twice. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a621622903c7ef17764b15293d6ea8df5a53c7e1)	2013-05-06 16:16:50 +10:00
Martin Schwenke	a320e1f7f1	eventscripts: Reimplement kill_tcp_connections_local_only() ... using kill_tcp_connections() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 10e4db8f796d1e3259733180494db3b4bbad291a)	2013-05-06 15:45:11 +10:00
Martin Schwenke	5e828b48fe	eventscripts: Change handling of one-way kills in kill_tcp_connections() This change is a no-op. However, in a subsequent commit we'll merge kill_tcp_connections_local_only() with this function. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 23c0f5f48e3e5a0c1a3254c582299f7893cf0d33)	2013-05-06 15:45:10 +10:00
Martin Schwenke	d98d931af3	eventscripts: Remove unnecessary variables from killtcp/tickle functions Setting these variables spawns lots of unnecessary processes, which would surely slow down these functions on a busy system. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3eae161472e6352f7f656851c73dc056f95113eb)	2013-05-06 15:45:10 +10:00
Martin Schwenke	6e2863a4f9	eventscripts: Clean up ctdb_check_command() * Command is now multiple arguments, preserving quoting * $service_name no longer printed, no longer an argument * Debug output from failed command Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9e25fb261447a196de05937052779b36e75e7215)	2013-05-06 15:45:10 +10:00
Martin Schwenke	30addb886a	eventscripts: Clean up ctdb_check_directories() The documentation comments are wrong... and remove optional $service_name argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d9e6cb945c5edac9ca6405c9228bf647fab814f5)	2013-05-06 15:45:10 +10:00
Martin Schwenke	0ad8f46db3	eventscripts: Assert that $service_name is set in a few key places Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3d0a7d83ddc824961d876fc9afba829c90aef3e7)	2013-05-06 15:45:10 +10:00
Martin Schwenke	5dd9e52e46	eventscripts: counters default to $script_name if $service_name not set Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit fff88940f71058e4eefd65f50a6701389c005c17)	2013-05-06 15:45:10 +10:00
Martin Schwenke	e9abc9c070	eventscripts: Simplify handling of $service name in "managed" functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. $service_name is no longer automatically set in the functions file. This means it needs to be explicitly set in 13.per_ip_routing because this script uses ctdb_service_check_reconfigure(). Eventscript unit test infrastructure needs to set $service_name during fake service setup, and policy routing tests need to be updated accordingly. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 27aab8783898a50da8c4bc887b512d8f0c0d842c)	2013-05-06 15:45:10 +10:00
Martin Schwenke	c56acf7127	eventscripts: Simplify handling of $service name in start/stop functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b5802c4735e1c719a5cf9ce69489d5947bd5e8c5)	2013-05-06 15:45:10 +10:00
Martin Schwenke	8065366b33	eventscripts: Simplify handling of $service name in service_management Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e24baac0d2952e86d5ff31235901f06e2f2b2449)	2013-05-06 15:45:10 +10:00
Martin Schwenke	4c9438b2a3	eventscripts: Simplify handling of $service name in reconfigure functions Complicated argument handling was introduced to deal with multiple services per eventscript. This was a failure and we split 50.samba. This simplifies several functions to use global $service_name unconditionally instead of having an optional argument. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c2ea72ff565222f9edab408638bd45dbba6e8ff7)	2013-05-06 15:45:10 +10:00
Martin Schwenke	642848b916	eventscripts: Remove unused function ctdb_check_counter_equal() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit fd536a26b310b5bf9628da62cca0b425f4a54030)	2013-05-06 15:45:10 +10:00
Martin Schwenke	bbd0ed0e29	scripts: Fix script_log() regression 5940a2494e9e43a83f2bca098bd04dfc1a8f2e93 makes script_log() always pass a message to logger, so script_log() can no longer log stdin. Put all the tag fu in the actual tag so the message argument is empty if no message was passed. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9dee4c84273633b9ad82e94dabbf0e6f86edbcef)	2013-05-06 15:43:16 +10:00
Martin Schwenke	27a5b78c8e	initscript: Look for tdbtool/tdbdump using which, not in fixed locations Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c74cc0442eb90d859eae270b59456d28605817c4)	2013-05-06 15:40:30 +10:00
Martin Schwenke	fa16cccf02	ctdbd: Remove the "stopped" event It isn't used, superseded by "ipreallocated". Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c2bb8596a8af6406ef50e53953884df9d6246a96)	2013-05-06 13:38:21 +10:00
Martin Schwenke	fb028a208c	eventscripts: Remove use of "stopped" event Use "ipreallocated" instead. The "stopped" event pre-dates the "ipreallocated" event. The only way of stopping a node is via the ctdb tool, which explicitly causes a takeover run to occur after the node is stopped. The takeover run will generate an "ipreallocated" event. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 978d4a0d6d8c9877b23f72e3a7b78c1245d16908)	2013-05-06 13:38:21 +10:00
Martin Schwenke	823edbf6fe	scripts: Ensure even external scripts get tagged in logs as "ctdbd" Our practice is to search logs for "ctdbd:". We want to make sure we find everything. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5940a2494e9e43a83f2bca098bd04dfc1a8f2e93)	2013-04-22 13:58:36 +10:00
Martin Schwenke	fb8be43d6d	eventscripts: Ensure directories are created Previous commits stopped the top level of the script from creating certain directories but some functions assume that required directories exist. Create those directories instead. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0076cfc4666e5a96eb2c8affb59585b090840e00)	2013-04-22 13:58:36 +10:00
Martin Schwenke	903f4c394c	scripts: Clean up update_tickles() and handling of associated directory Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 700cf95a1f29b4b88460a00a55d57a9e397011e0)	2013-04-19 13:13:36 +10:00
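
Commit 189a5c003c in the log above describes checking for the timeout condition before sleeping, rather than sleeping and then checking, so that the last failed attempt does not waste a second. The following is only a minimal sketch of that loop shape in POSIX sh. It is not the code from the CTDB functions file: the function name retry_check_before_sleep and its arguments are hypothetical, chosen purely for illustration.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not CTDB code).
# Usage: retry_check_before_sleep MAX_ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS attempts have failed.
retry_check_before_sleep ()
{
    _max="$1" ; shift
    _count=0
    while ! "$@" ; do
        _count=$((_count + 1))
        # Test the limit before sleeping, so that the final failed
        # attempt returns immediately instead of sleeping first.
        if [ "$_count" -ge "$_max" ] ; then
            return 1
        fi
        sleep 1
    done
    return 0
}
```

A caller might write "retry_check_before_sleep 3 some_check" where some_check is any command that exits 0 once the awaited condition holds. With a limit of 1, a failing check returns straight away without sleeping.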

1131 Commits