samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2024-12-22 13:34:15 +03:00

Author	SHA1	Message	Date
Martin Schwenke	4aa8e72d60	ctdb-recoverd: Rename update_local_flags() -> update_flags() This also updates remote flags so the name is misleading. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-08-18 05:02:25 +00:00
Martin Schwenke	702c7c4934	ctdb-recoverd: Change update_local_flags() to use already retrieved nodemaps BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-08-18 05:02:25 +00:00
Martin Schwenke	910a0b3b74	ctdb-recoverd: Get remote nodemaps earlier update_local_flags() will be changed to use these nodemaps. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-08-18 05:02:25 +00:00
Martin Schwenke	d50919b0cb	ctdb-recoverd: Do not fetch the nodemap from the recovery master The nodemap has already been fetched from the local node and is actually passed to this function. Care must be taken to avoid referencing the "remote" nodemap for the recovery master. It also isn't useful to do so, since it would be the same nodemap. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-08-18 05:02:25 +00:00
Martin Schwenke	762d1d8a96	ctdb-recoverd: Change get_remote_nodemaps() to use connected nodes The plan here is to use the nodemaps retrieved by get_remote_nodemaps() in update_local_flags(). This will improve efficiency, since get_remote_nodemaps() fetches flags from nodes in parallel. It also means that get_remote_nodemaps() can be used exactly once early on in main_loop() to retrieve remote nodemaps. Retrieving nodemaps multiple times is unnecessary and racy - a single monitoring iteration should not fetch flags multiple times and compare them. This introduces a temporary behaviour change but it will be of no consequence when the above changes are made. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-08-18 05:02:25 +00:00
Martin Schwenke	368c83bfe3	ctdb-recoverd: Fix node_pnn check and assignment of nodemap into array This array is indexed by the same index as nodemap, not the PNN. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-08-18 05:02:25 +00:00
Martin Schwenke	10ce0dbf1c	ctdb-recoverd: Add fail callback to assign banning credits Also drop error handling in main_loop() that is replaced by this change. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-08-18 05:02:25 +00:00
Martin Schwenke	a079ee3169	ctdb-recoverd: Add an intermediate state struct for nodemap fetching This will allow an error callback to be added. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-08-18 05:02:25 +00:00
Martin Schwenke	2eaa0af616	ctdb-recoverd: Move memory allocation into get_remote_nodemaps() BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-08-18 05:02:25 +00:00
Martin Schwenke	3324dd272c	ctdb-recoverd: Change signature of get_remote_nodemaps() Change 1st argument to a rec context, since this will be needed later. Drop the nodemap argument and access it via rec->nodemap instead. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-08-18 05:02:25 +00:00
Martin Schwenke	d2d90f2502	ctdb-recoverd: Fix a local memory leak The memory is allocated off the memory context used by the current iteration of main loop. It is freed when main loop completes, so the fix doesn't require backporting to stable branches. However, it is sloppy so it is worth fixing. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-08-18 05:02:25 +00:00
Martin Schwenke	52f520d39c	ctdb-recoverd: Basic cleanups for get_remote_nodemaps() Don't log an error on failure - let the caller do this. Apart from this: fix up coding style and modernise the remaining error message. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-08-18 05:02:25 +00:00
Ralph Boehme	2327471756	lib: relicense smb_strtoul(l) under LGPLv3 Signed-off-by: Ralph Boehme <slow@samba.org> Reviewed-by: Swen Schillig <swen@linux.ibm.com> Reviewed-by: Volker Lendecke <vl@samba.org> Autobuild-User(master): Jeremy Allison <jra@samba.org> Autobuild-Date(master): Mon Aug 3 22:21:04 UTC 2020 on sn-devel-184	2020-08-03 22:21:02 +00:00
Martin Schwenke	5ce6133a75	ctdb-recoverd: Simplify calculation of new flags Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Fri Jul 24 06:03:23 UTC 2020 on sn-devel-184	2020-07-24 06:03:23 +00:00
Martin Schwenke	3654e41677	ctdb-recoverd: Correctly find nodemap entry for pnn Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-07-24 04:41:25 +00:00
Martin Schwenke	9475ab0441	ctdb-recoverd: Do not retrieve nodemap from recovery master It is already in rec->nodemap. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-07-24 04:41:25 +00:00
Martin Schwenke	0c6a7db3ba	ctdb-recoverd: Flatten update_flags_on_all_nodes() The logic currently in ctdb_ctrl_modflags() will be optimised so that it no longer matches the pattern for a control function. So, remove this function and squash its functionality into the only caller. Although there are some superficial changes, the behaviour is unchanged. Flattening the 2 functions produces some seriously weird logic for setting the new flags, to the point where using ctdb_ctrl_modflags() for this purpose now looks very strange. The weirdness will be cleaned up in a subsequent commit. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-07-24 04:41:25 +00:00
Martin Schwenke	a88c10c5a9	ctdb-recoverd: Move ctdb_ctrl_modflags() to ctdb_recoverd.c This file is the only user of this function. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-07-24 04:41:25 +00:00
Martin Schwenke	b1e631ff92	ctdb-recoverd: Improve a call to update_flags_on_all_nodes() This should take a PNN, not an array index. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-07-24 04:41:25 +00:00
Martin Schwenke	915d24ac12	ctdb-recoverd: Use update_flags_on_all_nodes() This is clearer than using the MODFLAGS control directly. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-07-24 04:41:25 +00:00
Martin Schwenke	f681c0e947	ctdb-recoverd: Introduce some local variables to improve readability Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-07-24 04:41:25 +00:00
Martin Schwenke	cb3a3147b7	ctdb-recoverd: Change update_flags_on_all_nodes() to take rec argument This makes fields such as recmaster and nodemap easily available if required. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-07-24 04:41:25 +00:00
Martin Schwenke	6982fcb3e6	ctdb-recoverd: Drop unused nodemap argument from update_flags_on_all_nodes() An unused argument needlessly extends the length of function calls. A subsequent change will allow rec->nodemap to be used if necessary. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-07-24 04:41:25 +00:00
Martin Schwenke	53b73b9b0f	ctdb-daemon: Fix sorting of hot keys The current code only ever swaps with slot 0. This will only ever happen with slots 0 and 1, so probably never sorts. Replace with qsort(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-05-22 06:41:45 +00:00
Martin Schwenke	5c8dfbbf9b	ctdb-daemon: Add extra logging of hot keys ctdbd currently only logs when a new hot key is added. If a key gets hotter then nothing new is logged. Log hot key updates when the number of migrations has doubled since the last time that key was logged. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-05-22 06:41:45 +00:00
Martin Schwenke	baf058dcf7	ctdb-daemon: Update hot key logging This message indicates that a hot key was added, so say that. After all the hot key slots have been filled the id will always be 0, so don't bother logging it. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-05-22 06:41:44 +00:00
Martin Schwenke	1ab39b3270	ctdb-daemon: Fix bug in slot 0 comparison optimisation This is only valid if all slots are in use. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-05-22 06:41:44 +00:00
Martin Schwenke	f9f60c2a60	ctdb-daemon: Switch some variables to unsigned These should be unsigned but luck is currently on our side. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-05-22 06:41:44 +00:00
Martin Schwenke	21b9844bcb	ctdb-daemon: Add separate hot keys array for database statistics There are 2 reasons for this. Sorting of hot keys is broken and will be changed to an implementation that needs a named (i.e. not anonymous) structure. Also, at least one non-protocol field will be added to facilitate more useful logging. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-05-22 06:41:44 +00:00
Volker Lendecke	d9ccd853c3	ctdb: Implement CTDB_CONTROL_ECHO_DATA Testing control: 4 bytes msec delay plus a blob, return the request after the delay. This is an enhanced "ping" which can be used to test asynchronous clients. Doesn't have the full protocol implementation yet Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net>	2020-04-28 09:08:39 +00:00
Volker Lendecke	ad4b53f2d9	ctdb: Fix a memleak Bug: https://bugzilla.samba.org/show_bug.cgi?id=14348 Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Fri Apr 17 08:32:35 UTC 2020 on sn-devel-184	2020-04-17 08:32:35 +00:00
Martin Schwenke	f8f3d7954d	ctdb-vacuum: Reschedule vacuum event if VacuumInterval has increased The vacuuming integration tests set VacuumInterval to a very high number to avoid vacuuming collisions. This is done after the cluster is healthy, so Samba will have already been started and vacuuming will already be scheduled at the default interval for databases attached by Samba. This means that vacuuming controls used by vacuuming tests can still collide with the scheduled vacuuming events. Add some logic to reschedule a vacuuming event that has fired but where VacuumInterval has increased since it was originally scheduled. The increase in VacuumInterval is used as the time offset for rescheduling the event. Although this changes production behaviour for the convenience of testing, the new behaviour is completely reasonable and obeys the principle of least surprise. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue Apr 7 03:04:57 UTC 2020 on sn-devel-184	2020-04-07 03:04:57 +00:00
Martin Schwenke	5d03a3c86e	ctdb-vacuum: Store value of VacuumInterval in ctdb_vacuum_handle No behaviour change. This is final staging to make the next change completely obvious. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-04-07 01:26:41 +00:00
Martin Schwenke	7ad7c0b932	ctdb-vacuum: Use vacuum_handle local variables No behaviour change. This just makes future changes clearer by avoiding reformatting (or introducing local variables). Clean up error handling while touching a relevant line. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-04-07 01:26:41 +00:00
Martin Schwenke	716f52f68b	ctdb-recoverd: Avoid dereferencing NULL rec->nodemap Inside the nested event loop in ctdb_ctrl_getnodemap(), various asynchronous handlers may dereference rec->nodemap, which will be NULL. One example is lost_reclock_handler(), which causes rec->nodemap to be unconditionally dereferenced in list_of_nodes() via this call chain: list_of_nodes() list_of_active_nodes() set_recovery_mode() force_election() lost_reclock_handler() Instead of attempting to trace all of the cases, just avoid leaving rec->nodemap set to NULL. Attempting to use an old value is generally harmless, especially since it will be the same as the new value in most cases. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14324 Reported-by: Volker Lendecke <vl@samba.org> Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Tue Mar 24 01:22:45 UTC 2020 on sn-devel-184	2020-03-24 01:22:45 +00:00
Martin Schwenke	147afe77de	ctdb-daemon: Don't allow attach from recovery if recovery is not active Neither the recovery daemon nor the recovery helper should attach databases outside of the recovery process. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-03-23 23:45:38 +00:00
Martin Schwenke	052f1bdb9c	ctdb-daemon: Remove more unused old client database functions BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-03-23 23:45:38 +00:00
Martin Schwenke	3a66d181b6	ctdb-recovery: Remove old code for creating missing databases BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-03-23 23:45:38 +00:00
Martin Schwenke	76a8174279	ctdb-recovery: Create database on nodes where it is missing BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-03-23 23:45:38 +00:00
Martin Schwenke	e6e63f8fb8	ctdb-recovery: Fetch database name from all nodes where it is attached BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-03-23 23:45:38 +00:00
Martin Schwenke	1bdfeb3fdc	ctdb-recovery: Pass db structure for each database recovery Instead of db_id and db_flags. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-03-23 23:45:38 +00:00
Martin Schwenke	c6f74e590f	ctdb-recovery: GET_DBMAP from all nodes This builds a complete list of databases across the cluster so it can be used to create databases on the nodes where they are missing. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-03-23 23:45:38 +00:00
Martin Schwenke	4c0b9c3605	ctdb-recovery: Replace use of ctdb_dbid_map with local db_list This will be used to build a merged list of databases from all nodes, allowing the recovery helper to create missing databases. It would be possible to also include the db_name field in this structure but that would cause a lot of churn. This field is used locally in the recovery of each database so can continue to live in the relevant state structure(s). BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-03-23 23:45:38 +00:00
Martin Schwenke	7e5a8a4884	ctdb-daemon: Respect CTDB_CTRL_FLAG_ATTACH_RECOVERY when attaching databases This is currently only set by the recovery daemon when it attaches missing databases, so there is no obvious behaviour change. However, attaching missing databases can now be moved to the recovery helper as long as it sets this flag. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-03-23 23:45:38 +00:00
Martin Schwenke	98e3d0db2b	ctdb-recovery: Use CTDB_CTRL_FLAG_ATTACH_RECOVERY to attach during recovery ctdb_ctrl_createdb() is only called by the recovery daemon, so this is a safe, temporary change. This is temporary because ctdb_ctrl_createdb(), create_missing_remote_databases() and create_missing_local_databases() will all go away soon. Note that this doesn't cause a change in behaviour. The main daemon will still only defer attaches from non-recoverd processes during recovery. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-03-23 23:45:38 +00:00
Martin Schwenke	fc23cd1b9c	ctdb-daemon: Remove unused old client database functions BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-03-23 23:45:37 +00:00
Martin Schwenke	c6c89495fb	ctdb-daemon: Fix database attach deferral logic Commit `3cc230b5ee` says: Dont allow clients to connect to databases untile we are well past and through the initial recovery phase It is unclear what this commit was attempting to do. The commit message implies that more attaches should be deferred but the code change adds a conjunction that causes less attaches to be deferred. In particular, no attaches will be deferred after startup is complete. This seems wrong. To implement what seems to be stated in the commit message an "or" needs to be used so that non-recovery daemon attaches are deferred either when in recovery or before startup is complete. Making this change highlights that attaches need to be allowed during the "startup" event because this is when smbd is started. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-03-23 23:45:37 +00:00
Amitay Isaacs	1c56d6413f	ctdb-recovery: Refactor banning a node into separate computation If a node is marked for banning, confirm that it's not become inactive during the recovery. If yes, then don't ban the node. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2020-03-23 23:45:37 +00:00
Amitay Isaacs	c6a0ff1bed	ctdb-recovery: Don't trust nodemap obtained from local node It's possible to have a node stopped, but recovery master not yet updated flags on the local ctdb daemon when recovery is started. So do not trust the list of active nodes obtained from the local node. Query the connected nodes to calculate the list of active nodes. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2020-03-23 23:45:37 +00:00
Amitay Isaacs	6e2f8756f1	ctdb-recovery: Consolidate node state This avoids passing multiple arguments to async computation. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2020-03-23 23:45:37 +00:00
Amitay Isaacs	072ff4d12b	ctdb-recovery: Fetched vnnmap is never used, so don't fetch it New vnnmap is constructed using the information from all the connected nodes. So there is no need to fetch the vnnmap from recovery master. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2020-03-23 23:45:37 +00:00
Martin Schwenke	15762a3455	ctdb-daemon: more logical whitespace, debug modernisation BUG: https://bugzilla.samba.org/show_bug.cgi?id=14295 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Ralph Boehme <slow@samba.org>	2020-03-12 03:47:30 +00:00
Ralph Boehme	6a4fa0785f	ctdb-daemon: ensure restart() callback is called in half-connected state If NODE_FLAGS_DISCONNECTED is set the node can be in half-connected state. With this change we ensure to restart the transport for this case. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14295 Signed-off-by: Ralph Boehme <slow@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net>	2020-03-12 03:47:30 +00:00
Martin Schwenke	c9405aec70	ctdb-daemon: Check for lock count underflow This is a programming error. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-02-18 02:56:38 +00:00
Martin Schwenke	1a0e1f8924	ctdb-daemon: Fork when not interactive and test mode is enabled There is no sane way of keeping stdin open when using the shell to background ctdbd in local_daemons.sh. Instead, have ctdbd fork when not interactive and when test mode is enabled. become_daemon() can't be used for this: if it forks then it also closes stdin. For the interactive case, become_daemon() wasn't doing anything special, so do nothing instead. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-02-10 04:07:39 +00:00
Martin Schwenke	a220e9454a	ctdb-daemon: Make some conditions more explicit These don't need to depend on do_fork. Child logging should be set up whenever the daemon is not interactive. The stdin handler should be setup whenever test mode is enabled. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-02-10 04:07:39 +00:00
Martin Schwenke	cefb3327c6	ctdb-daemon: Pass more information to ctdb_start_daemon() No functional changes. This is staging for a change that makes ctdbd fork when test mode is enabled but interactive is not set. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-02-10 04:07:38 +00:00
Martin Schwenke	cf460bd9c4	ctdb-daemon: Shut down if interactive and stdin is closed This allows a test environment to simply close its end of a pipe to cleanly shutdown ctdbd. Like in smbd, this is only done if stdin is a pipe or a socket. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-01-28 09:57:32 +00:00
Martin Schwenke	d79e2dcfc8	ctdb-daemon: Only stop monitoring if it has been initialised This avoids a crash if ctdb_shutdown_sequence() is called before monitoring is initialised. Switch to using TALLOC_FREE() while touching this function. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-01-28 09:57:32 +00:00
Martin Schwenke	aa2977e151	ctdb-mutex: Change default re-check time for fcntl helper to 5s Testing against a commonly used cluster filesystem has shown no performance impact, as expected. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2020-01-21 11:39:40 +00:00
Volker Lendecke	42a3e2e503	ctdbd: Use struct initialization 2 lines less Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>	2020-01-19 18:29:39 +00:00
Björn Jacke	f3754b6487	ctdb/server/ctdb_daemon.c: typo fixes Signed-off-by: Bjoern Jacke <bjacke@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net>	2019-10-31 00:43:39 +00:00
Björn Jacke	5d2a257c2e	ctdb/server/ctdb_client.c: typo fixes Signed-off-by: Bjoern Jacke <bjacke@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net>	2019-10-31 00:43:39 +00:00
Björn Jacke	7722bd80fc	ctdb/server/ctdb_call.c: typo fixes Signed-off-by: Bjoern Jacke <bjacke@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net>	2019-10-31 00:43:38 +00:00
Martin Schwenke	41a41d5f3e	ctdb-daemon: Implement DB_VACUUM control Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-10-24 04:06:43 +00:00
Martin Schwenke	d462d64cdf	ctdb-vacuum: Only schedule next vacuum event if vacuuming is scheduled At the moment vacuuming is always scheduled. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-10-24 04:06:43 +00:00
Martin Schwenke	13cedaf019	ctdb-daemon: Factor out code to create vacuuming child This changes the behaviour for some failures from exiting to simply attempting to schedule the next run. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-10-24 04:06:43 +00:00
Martin Schwenke	5539edfdbe	ctdb-vacuum: Simplify recording of in-progress vacuuming child There can only be one, so simplify the logic. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-10-24 04:06:43 +00:00
Amitay Isaacs	d0cc9edc05	ctdb-vacuum: Avoid processing any more packets All the vacuum operations if required have an event loop to ensure completion of pending operations. Once all the steps are complete, there is no reason to process any more packets. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2019-10-24 04:06:43 +00:00
Amitay Isaacs	680df07630	ctdb-daemon: Avoid memory leak when packet is deferred Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2019-10-24 04:06:43 +00:00
Amitay Isaacs	c6427dddf5	ctdb-recoverd: No need for database detach handler The only reason for recoverd attaching to databases was to migrate records to the local node as part of vacuuming. Recovery daemon does not take part in database vacuuming any more. The actual database recovery is handled via the recovery_helper and recovery daemon should not need to attach to the databases any more. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2019-10-24 04:06:43 +00:00
Amitay Isaacs	fc81729dd2	ctdb-recoverd: Drop VACUUM_FETCH message handling This is now implemented in the ctdb daemon using VACUUM_FETCH control. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2019-10-24 04:06:43 +00:00
Amitay Isaacs	498932c0e8	ctdb-vacuum: Replace VACUUM_FETCH message with control Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2019-10-24 04:06:42 +00:00
Amitay Isaacs	86521837b6	ctdb-vacuum: Add processing of fetch queue Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2019-10-24 04:06:42 +00:00
Amitay Isaacs	da617f90d9	ctdb-daemon: Add implementation of VACUUM_FETCH control Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2019-10-24 04:06:42 +00:00
Martin Schwenke	815ae64400	ctdb-vacuum: Drop debug level of repacking message to NOTICE This occurs rarely but can adversely impact performance, so it is worth logging it more frequently. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-10-04 05:47:35 +00:00
Amitay Isaacs	33f1c9d965	ctdb-vacuum: Process all records not deleted on a remote node This currently skips the last record. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14147 RN: Avoid potential data loss during recovery after vacuuming error Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2019-10-04 05:47:34 +00:00
Mathieu Parent	7cb0ca4171	Spelling fixes s/ dont / don't / Excluding examples/tridge/smb.conf Signed-off-by: Mathieu Parent <math.parent@gmail.com> Reviewed-by: Andrew Bartlett <abartlet@samba.org> Reviewed-by: Gary Lockyer <gary@catalyst.net.nz>	2019-09-01 22:21:27 +00:00
Mathieu Parent	736bb924f7	Spelling fixes s/ ot / to / Signed-off-by: Mathieu Parent <math.parent@gmail.com> Reviewed-by: Andrew Bartlett <abartlet@samba.org> Reviewed-by: Gary Lockyer <gary@catalyst.net.nz>	2019-09-01 22:21:27 +00:00
Martin Schwenke	8190993d99	ctdb-recoverd: Fix typo in previous fix BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue Aug 27 15:29:11 UTC 2019 on sn-devel-184	2019-08-27 15:29:11 +00:00
Martin Schwenke	5d655ac6f2	ctdb-recoverd: Only check for LMASTER nodes in the VNN map BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-08-21 11:50:30 +00:00
Martin Schwenke	e9f2e205ee	ctdb-daemon: Make node inactive in the NODE_STOP control Currently some of this is supported by a periodic check in the recovery daemon's main_loop(), which notices the flag change, sets recovery mode active and freezes databases. If STOP_NODE returns immediately then the associated recovery can complete and the node can be continued before databases are actually frozen. Instead, immediately do all of the things that make a node inactive. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14087 RN: Stop "ctdb stop" from completing before freezing databases Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue Aug 20 08:32:27 UTC 2019 on sn-devel-184	2019-08-20 08:32:27 +00:00
Martin Schwenke	91ac4c13d8	ctdb-daemon: Drop unused function ctdb_local_node_got_banned() BUG: https://bugzilla.samba.org/show_bug.cgi?id=14087 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-08-20 07:15:41 +00:00
Martin Schwenke	0f5f7b7cf4	ctdb-daemon: Switch banning code to use ctdb_node_become_inactive() There's no reason to avoid immediately setting recovery mode to active and initiating freeze of databases. This effectively reverts the following commits: `d8f3b490bb` `b4357a79d9` The latter is now implemented using a control, resulting in looser coupling. See also the following commit: `f8141e91a6` BUG: https://bugzilla.samba.org/show_bug.cgi?id=14087 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-08-20 07:15:41 +00:00
Martin Schwenke	a42bcaabb6	ctdb-daemon: Factor out new function ctdb_node_become_inactive() This is a superset of ctdb_local_node_got_banned() so will replace that function, and will also be used in the NODE_STOP control. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14087 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-08-20 07:15:41 +00:00
Martin Schwenke	3acb8e9d1c	ctdb-daemon: Add function ctdb_ip_to_node() This is the core logic from ctdb_ip_to_pnn(), so re-implement that function using ctdb_ip_to_node(). Something similar (ctdb_ip_to_nodeid()) was recently removed in commit `010c1d77cd` because it wasn't required. Now there is a use case. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14084 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-08-16 21:30:35 +00:00
Martin Schwenke	6c9d1f855e	ctdb-daemon: Avoid signed/unsigned comparison by casting Compiling with -Wsign-compare complains: 1047 | && (call->call_id == CTDB_FETCH_WITH_HEADER_FUNC)) { | ^~ struct ctdb_call is a protocol element, so we can't simply change it. Found by csbuild. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Wed Aug 14 10:29:59 UTC 2019 on sn-devel-184	2019-08-14 10:29:59 +00:00
Martin Schwenke	4bdfbbd8d4	ctdb-daemon: Avoid signed/unsigned comparison by declaring as unsigned Compiling with -Wsign-compare complains: ctdb/server/ctdb_call.c:831:12: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare] 831 | if (count <= ctdb_db->statistics.hot_keys[0].count) { | ^~ and ctdb/server/ctdb_call.c:844:13: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare] 844 | if (count <= ctdb_db->statistics.hot_keys[i].count) { | ^~ Found by cs-build. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-08-14 09:11:36 +00:00
Martin Schwenke	41cd44724e	ctdb-mutex: Add support for exiting if the lock file disappears If the lock file is inaccessible or the inode number changes then the lock is lost, so exit. This allows the recovery daemon to trigger an election. The ensuing recovery will re-take the lock. By default the lock file is checked every 60 seconds. A lot can happen in 60 seconds but being more aggressive and accessing the lock too often could result in a performance issue for the cluster filesystem. A new optional 2nd argument is added, which is the lock file re-check time in seconds. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-26 03:34:17 +00:00
Martin Schwenke	af8de1bcfd	ctdb-mutex: Add an intermediate asynchronous computation for waiting This will allow more conditions to be waited on via additional sub-requests. At the moment this just completes when the parent wait completes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-26 03:34:17 +00:00
Martin Schwenke	fae8e438f0	ctdb-mutex: Change parent checking to use an asynchronous computation Put the checking for the process being immediately re-parented into the computation too. This will be very rare and doing it consistently makes testing saner. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-26 03:34:17 +00:00
Martin Schwenke	2f768a090e	ctdb-mutex: Exit immediately if the lock isn't taken There is no need to wait until the parent kills the helper. The parent will get the initial response, indicating contention or similar, and will then get a separate event indicating that the pipe is gone. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-26 03:34:17 +00:00
Martin Schwenke	2b6f1a8ee6	ctdb-mutex: Drop dependency on ctdb_set_helper This makes the code more explicit and makes testing easier due to less dependencies. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-26 03:34:17 +00:00
Martin Schwenke	76ab0a2b82	ctdb-mutex: Drop unneeded assignment clang warns: ctdb/server/ctdb_mutex_fcntl_helper.c:61:3: warning: Value stored to 'fd' is never read Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-26 03:34:16 +00:00
Martin Schwenke	98169241ef	ctdb-mutex: Update to use modern debug macro One of these had a missing space, so this implicitly fixes it. It also drops the need to unnecessarily include common.h, which comes with some dependency baggage. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-26 03:34:16 +00:00
Martin Schwenke	6fe963c3f7	ctdb-recoverd: Periodically log recovery master of incomplete cluster Only do this if the recovery lock is unset. Log every minute for the first 10 minutes, then every 10 minutes, then every hour. This is useful for determining whether a split brain occurred. It is particularly useful if logging failed or was throttled at startup, so there is no evidence of the split brain when it began. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-26 03:34:16 +00:00
Martin Schwenke	f2559ef8ce	ctdb-recoverd: Log the master at the end of elections Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-26 03:34:16 +00:00
Martin Schwenke	755a9e654f	ctdb-daemon: Don't check if lock_ctx->ctdb_db is NULL This can never be NULL. It could probably be NULL in the past when "all database" locks existed. There are paths where it is checked for NULL and then later dereferenced, causing static analysers to produce spurious warnings. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-05 05:03:25 +00:00
Martin Schwenke	79a7cc3fb9	ctdb-daemon: Drop unused function ctdb_vfork_with_logging() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-05 05:03:24 +00:00
Martin Schwenke	75a808fd86	ctdb-daemon: Don't index by PNN when initialising node flags Indexing by PNN is wrong. This also removes a signed/unsigned comparison because the PNN is not compared to -1 anymore. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-05 05:03:24 +00:00
Martin Schwenke	010c1d77cd	ctdb-daemon: Replace function ctdb_ip_to_nodeid() with ctdb_ip_to_pnn() Node ID is a poorly defined concept, indicating the slot in the node map where the IP address was found. This signed value also ends up compared to num_nodes, which is unsigned, producing unwanted warnings. Just return the PNN because this what both callers really want. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-05 05:03:23 +00:00
Martin Schwenke	4c24d434b9	ctdb-cluster-mutex: Ensure that the configured command is not empty ... and does not just contain whitespace. Otherwise NULL can be passed as the first argument to execv(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-05 05:03:23 +00:00
Martin Schwenke	9c75ad6818	ctdb-daemon: Drop unused values assigned to variable Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-05 05:03:23 +00:00
Martin Schwenke	c39441f62d	ctdb-daemon: Fix signed/unsigned comparisons by using constant Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-05 05:03:23 +00:00
Martin Schwenke	76e930d784	ctdb-daemon: Fix signed/unsigned comparisons by casting Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-05 05:03:23 +00:00
Martin Schwenke	1e47a1b3f6	ctdb-daemon: Fix signed/unsigned comparisons by declaring as unsigned Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-05 05:03:23 +00:00
Martin Schwenke	3ccce53e3e	ctdb-daemon: Make type of list_of_nodes() consistent with callers Instead of taking exclude_pnn as a parameter, calculate it from an include_self_parameter, which is passed through from the 2 calling functions. While doing this, fix a signed/unsigned comparison issue by declaring the new exclude_pnn local variable as an unsigned type. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-05 05:03:22 +00:00
Martin Schwenke	6556347901	ctdb-daemon: Make old list_of_nodes() function static The next commit will change the type of this function, which is only used in this file. So, make it static to isolate the change. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-07-05 05:03:22 +00:00
Swen Schillig	73640b8ad8	ctdb: Update all consumers of strtoul_err(), strtoull_err() to new API Signed-off-by: Swen Schillig <swen@linux.ibm.com> Reviewed-by: Ralph Boehme <slow@samba.org> Reviewed-by: Christof Schmitt <cs@samba.org>	2019-06-30 11:32:18 +00:00
Martin Schwenke	b1d83fb3e8	ctdb-daemon: Attempt to silence CID 1357985 (Unchecked return value) Yes, the other callers check the return value of ctdb_lockdb_mark(). However, this is called in a void function and ctdb_lockdb_mark() has already printed any error message. All we can do is explicitly ignore the return value. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-06-05 10:25:50 +00:00
Martin Schwenke	2db0e71d3b	ctdb-ipalloc: Fix warning about unused value assigned to srcimbl To make this much clearer, move the declaration into the scope where it is used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-06-05 10:25:50 +00:00
Martin Schwenke	7df15b246a	ctdb-ipalloc: Avoid -1 as a PNN, use CTDB_UNKNOWN_PNN instead This fixes warnings about signed versus unsigned comparisons. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-06-05 10:25:50 +00:00
Martin Schwenke	86666d6570	ctdb-ipalloc: Fix signed/unsigned comparisons by declaring as unsigned Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-06-05 10:25:50 +00:00
Martin Schwenke	90622ab901	ctdb-recovery: Fix signed/unsigned comparisons by declaring as unsigned Simple cases where variables and function parameters need to be declared as an unsigned type instead of an int. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-06-05 10:25:50 +00:00
Martin Schwenke	35368d871d	ctdb-recovery: Avoid -1 as a PNN, use CTDB_UNKNOWN_PNN instead Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-06-05 10:25:50 +00:00
Martin Schwenke	978c7dbd55	ctdb-recovery: Fix signed/unsigned comparison by casting Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-06-05 10:25:50 +00:00
Martin Schwenke	fa7bd35b6a	ctdb-recovery: Fix signed/unsigned comparisons by declaring as unsigned Simple cases where variables need to be declared as an unsigned type instead of an int. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-06-05 10:25:50 +00:00
Stefan Metzmacher	b9b3acf23e	ctdb:takeover: add better debugging when a client connects to a non public address Signed-off-by: Stefan Metzmacher <metze@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net>	2019-06-04 22:13:07 +00:00
Noel Power	71e7b5d14e	ctdb/server: cppcheck: fix shiftTooManyBitsSigned error Fixes ctdb/server/ipalloc_lcp2.c:61: error: shiftTooManyBitsSigned: Shifting signed 32-bit value by 31 bits is undefined behaviour <--[cppcheck] Signed-off-by: Noel Power <noel.power@suse.com> Reviewed-by: Andreas Schneider <asn@samba.org>	2019-06-04 22:13:07 +00:00
Volker Lendecke	e7424897a1	ctdb: Make TDB_SEQNUM work synchronously with ctdb Old war story completely from memory, I could not find the commit that introduced TDB_SEQNUM so far...: Back in the days when ctdb was initially developed, TDB_SEQNUM's only user was the notify.tdb that held one huge record for all notify records. With that use case in mind it made perfect sense to keep the SEQNUM stable locally, sacrificing precision. By now notify.tdb is long gone, and the only user of TDB_SEQNUM right now is brlock.tdb, which contains special case code for the imprecise ctdb implementation of TDB_SEQNUM. With this commit, that special code can go: The TDB_SEQNUM will also increment when just the DMASTER header field changes, indicating to smbd that someone else might have changed the record. This will of course increase the SEQNUM frequency, but it should not increase the load on ctdb: If you look at the brlock.c workaround, it just does not do the caching that is possible with precise TDB_SEQNUMs working. How did I get here? I want to move brl_num_read_oplocks() from brlock.tdb into locking.tdb, and for that I need precise TDB_SEQNUMs for locking.tdb. Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org> Autobuild-User(master): Jeremy Allison <jra@samba.org> Autobuild-Date(master): Fri May 24 00:42:17 UTC 2019 on sn-devel-184	2019-05-24 00:42:17 +00:00
Martin Schwenke	6a2941e2a9	ctdb-recoverd: Fix memory leak state is always freed before exiting this function, so allocate fde off it instead of long-lived ctdb context. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13943 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-05-14 07:25:37 +00:00
Martin Schwenke	8663e0a64f	ctdb-daemon: Never use 0 as a client ID ctdb_control_db_attach() and ctdb_control_db_detach() assume that any control with client ID 0 comes from another daemon and treat it specially. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13930 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-05-13 07:27:24 +00:00
Martin Schwenke	95477e69e3	ctdb-daemon: Log when ctdbd CPU utilisation exceeds a threshold This is to help us notice when ctdbd is using the full capacity of a CPU, so is saturated. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-05-07 05:45:34 +00:00
Volker Lendecke	43cacaad57	ctdb: Fix a typo Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Sat Apr 6 11:51:55 UTC 2019 on sn-devel-144	2019-04-06 11:51:55 +00:00
Volker Lendecke	bb1e32297e	ctdb: Slightly simplify ctdb_ltdb_lock_fetch_requeue Reduce indentation with an early return Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net>	2019-04-06 10:47:13 +00:00
Amitay Isaacs	edd4a23d76	ctdb-version: Simplify version string usage There is no need to write SAMBA_VERSION_STRING as CTDB_VERSION_STRING. Wherever required use SAMBA_VERSION_STRING directly. Avoids the confusion with two version.h files. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13789 Signed-off-by: Amitay Isaacs <amitay@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri Mar 15 06:31:50 UTC 2019 on sn-devel-144	2019-03-15 06:31:50 +00:00
Martin Schwenke	8c2ff3f2b5	ctdb-daemon: Add an environment variable to set version This can be used to test the version checking logic. Cache the version to avoid re-checking the environment variable each time. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@samba.org>	2019-03-15 05:17:14 +00:00
Amitay Isaacs	278eb236ae	ctdb-daemon: Fix maybe-uninitialized error with picky developer 263/386] Compiling ctdb/server/ctdb_recovery_helper.c In file included from ../../server/ctdb_recovery_helper.c:24:0: ../../server/ctdb_recovery_helper.c: In function ‘main’: ../../../lib/talloc/talloc.h:911:34: error: ‘mem_ctx’ may be used uninitialized in this function [-Werror=maybe-uninitialized] #define TALLOC_FREE(ctx) do { if (ctx != NULL) { talloc_free(ctx); ctx=NULL; } } while(0) Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Jeremy Allison <jra@samba.org>	2019-03-01 17:21:15 +00:00
Swen Schillig	55acae774a	ctdb-server: Use wrapper for string to integer conversion In order to detect a value overflow error during the string to integer conversion with strtoul/strtoull, the errno variable must be set to zero before the execution and checked after the conversion is performed. This is achieved by using the wrapper function strtoul_err and strtoull_err. Signed-off-by: Swen Schillig <swen@linux.ibm.com> Reviewed-by: Ralph Böhme <slow@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>	2019-03-01 00:32:11 +00:00
Martin Schwenke	c93430fe8f	ctdb-cluster-mutex: Separate out command and file handling This code is difficult to read and there really is no common code between the 2 cases. For example, there is no need to split a filename into words. Separating each of the 2 cases into its own function makes the logic much easier to understand. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Feb 25 03:40:16 CET 2019 on sn-devel-144	2019-02-25 03:40:16 +01:00
Martin Schwenke	13a1a48089	ctdb-recoverd: Time out attempt to take recovery lock after 120s Currently this will wait forever. It really needs a timeout in case the cluster filesystem (or other lock mechanism) is completely wedged. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-02-25 02:12:17 +01:00
Martin Schwenke	45a77d65b2	ctdb-recoverd: Ban node on unknown error when taking recovery lock We really shouldn't see unknown errors. They probably represent a misconfigured recovery lock or similar. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-02-25 02:12:17 +01:00
Martin Schwenke	c0fb62ed39	ctdb-recoverd: Make recoverd context available in recovery lock handle BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-02-25 02:12:16 +01:00
Martin Schwenke	7e4aae6943	ctdb-recoverd: Clean up logging on failure to take recovery lock Add an explicit case for a timeout and clean up the other messages. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-02-25 02:12:16 +01:00
Martin Schwenke	621658cbed	ctdb-recoverd: Free cluster mutex handler on failure to take lock If nested events occur while the file descriptor handler is still active then chaos can ensue. For example, if a node is banned and the lock is explicitly cancelled (e.g. due to election loss) then double-talloc-free()s abound. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2019-02-25 02:12:16 +01:00
Martin Schwenke	944c92a15d	ctdb-daemon: Modernise debug during record deletion for vacuuming Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue Dec 18 10:13:50 CET 2018 on sn-devel-144	2018-12-18 10:13:50 +01:00
Martin Schwenke	cdca0d7e78	ctdb-daemon: Add extra debug during record deletion for vacuuming It isn't currently possible to distinguish these 2 cases. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2018-12-18 07:12:10 +01:00
Martin Schwenke	f1b594dce1	ctdb-daemon: Do not force full vacuum on first vacuuming run When the number of fast path vacuuming runs is 0 then a full vacuuming run is done. This means the first one is a full run, which is almost certainly not what is intended. Combine the 2 conditionals to only flag a full vacuuming run when the count exceeds the configured limit. This means that the full_vacuum_run flag is set in both parent and child, but this is harmless... and is better than getting it wrong. Also tweak the comparison to be less-than-or-equal, since the zeroth run needs to be counted. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2018-12-18 07:12:09 +01:00
Martin Schwenke	da8aaf2aee	ctdb-recoverd: Call an election when the recovery lock is lost The lock may have been lost due to a failure in the underlying locking mechanism. This could be due to quorum loss or similar. It is best to call an election to confirm that this node should still be master. At worst, the node will reelect itself, fail to take the lock and then ban itself. This is a suitable outcome for a node that has been partitioned from others in the cluster. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2018-12-18 02:02:03 +01:00
Martin Schwenke	93284ed032	ctdb-daemon: Divide by 2 when calculating hop count bucket This provides finer resolution while still maintaining a reasonable maximum. In this case the top bucket contains any hop counts >= 16384, compared to the current situation where the top bucket contains hop counts >= 268435456. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2018-12-18 02:02:03 +01:00
Martin Schwenke	dd7574afd1	ctdb-daemon: Exit with error if a database directory does not exist Since 4.9.0, the log messages can be confusing if a required database directory does not exist. Explicitly check for database directories, logging a clear error and exiting if one is missing. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13696 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Dec 3 06:56:41 CET 2018 on sn-devel-144	2018-12-03 06:56:41 +01:00
Andreas Schneider	2d512b278e	debug: Use debuglevel_(get\|set) function Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org> Autobuild-User(master): Andreas Schneider <asn@cryptomilk.org> Autobuild-Date(master): Thu Nov 8 11:03:11 CET 2018 on sn-devel-144	2018-11-08 11:03:11 +01:00
Martin Schwenke	6e16e95f74	ctdb-daemon: Do not fork when CTDB_TEST_MODE is set Explicitly background ctdbd instead. This has the advantage of leaving stdin open. ctdbd can then be enhanced to exit when stdin closes, allowing better cleanup in a test environment. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue Nov 6 10:30:14 CET 2018 on sn-devel-144	2018-11-06 10:30:14 +01:00
Martin Schwenke	01f6fbba4e	ctdb-daemon: Switch interactive variable to a bool popt uses an int in place of a bool, so declare an extra int and make the conversion explicit. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2018-11-06 07:16:18 +01:00
Martin Schwenke	4e6bd42493	ctdb-daemon: Improve documentation for -i option Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2018-11-06 07:16:15 +01:00
Martin Schwenke	9c41481f21	ctdb-daemon: Don't set log_to_stdout for become_daemon() ctdbd logs to stderr in interactive mode, not stdout. This way stdout is always closed. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2018-11-06 07:16:15 +01:00
Martin Schwenke	c84254d23d	ctdb-daemon: Avoid unnecessarily spamming the logs when in test mode Logging the logging location to syslog can be useful on production systems when the configuration goes unexpectedly missing. However, in test mode this just adds noise to the logs on the test system. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2018-11-06 07:16:14 +01:00
Martin Schwenke	d75fa2c3fd	ctdb-daemon: Drop unused function ctdb_set_socketname() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2018-11-06 07:16:14 +01:00
Martin Schwenke	5f478b7c5f	ctdb-daemon: Use path functions for socket and PID file Drop the use of ctdb_set_sockname() because it complicates the memory allocation and this is the only place it is used. Just assign to the relevant pointer. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2018-11-06 07:16:14 +01:00
Martin Schwenke	27df4f002a	ctdb-recovery: Ban a node that causes recovery failure ... instead of applying banning credits. There have been a couple of cases where recovery repeatedly takes just over 2 minutes to fail. Therefore, banning credits expire between failures and a continuously problematic node is never banned, resulting in endless recoveries. This is because it takes 2 applications of banning credits before a node is banned, which generally involves 2 recovery failures. The recovery helper makes up to 3 attempts to recover each database during a single run. If a node causes 3 failures then this is really equivalent to 3 recovery failures in the model that existed before the recovery helper added retries. In that case the node would have been banned after 2 failures. So, instead of applying banning credits to the "most failing" node, simply ban it directly from the recovery helper. If multiple nodes are causing recovery failures then this can cause a node to be banned more quickly than it might otherwise have been, even pre-recovery-helper. However, 90 seconds (i.e. 3 failures) is a long time to be in recovery, so banning earlier seems like the best approach. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13670 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Nov 5 06:52:33 CET 2018 on sn-devel-144	2018-11-05 06:52:33 +01:00
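
The bucket thresholds quoted in commit `93284ed032` (top bucket starting at 16384 after the change, versus 268435456 before it) can be reproduced with a small sketch. This is not the ctdb source: the helper name `hop_count_bucket`, the `shift` parameter, and the bucket count of 16 are assumptions chosen for illustration, picked because they yield the figures given in the commit message.

```c
#include <stdint.h>

/* Assumption: 16 histogram buckets; the constant used by ctdb may differ. */
#define HOP_COUNT_BUCKETS 16

/*
 * Hypothetical helper: map a hop count to a bucket index by repeatedly
 * shifting right by `shift` bits and counting the steps until the value
 * reaches zero. shift == 1 divides by 2 per step, shift == 2 divides by 4.
 * `shift` must be between 1 and 31. Results past the last bucket are
 * clamped to it.
 */
static unsigned int hop_count_bucket(uint32_t hopcount, unsigned int shift)
{
	unsigned int bucket = 0;

	while (hopcount != 0) {
		hopcount >>= shift;
		bucket++;
	}
	if (bucket >= HOP_COUNT_BUCKETS) {
		bucket = HOP_COUNT_BUCKETS - 1;
	}
	return bucket;
}
```

Under these assumptions, `hop_count_bucket(16384, 1)` lands in the top bucket (index 15) while `hop_count_bucket(16383, 1)` does not; with a shift of 2, the top bucket is only reached at 268435456. Halving per step therefore spreads realistic hop counts across more buckets.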

2602 Commits