samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2024-12-27 03:21:53 +03:00

63 lines

1.6 KiB

C


s3: Fix a long-standing problem with recycled PIDs When a samba server process dies hard, it has no chance to clean up its entries in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb. For locking.tdb and brlock.tdb Samba is robust by checking every time we read an entry from the database if the corresponding process still exists. If it does not exist anymore, the entry is deleted. This is not 100% failsafe though: On systems with a limited PID space there is a non-zero chance that between the smbd's death and the fresh access, the PID is recycled by another long-running process. This renders all files that had been locked by the killed smbd potentially unusable until the new process also dies. This patch is supposed to fix the problem the following way: Every process ID in every database is augmented by a random 64-bit number that is stored in a serverid.tdb. Whenever we need to check if a process still exists we know its PID and the 64-bit number. We look up the PID in serverid.tdb and compare the 64-bit number. If it's the same, the process still is a valid smbd holding the lock. If it is different, a new smbd has taken over. I believe this is safe against an smbd that has died hard and the PID has been taken over by a non-samba process. This process would not have registered itself with a fresh 64-bit number in serverid.tdb, so the old one still exists in serverid.tdb. We protect against this case by the parent smbd taking care of deregistering PIDs from serverid.tdb and the fact that serverid.tdb is CLEAR_IF_FIRST. CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not work when all smbds are restarted. For this, "net serverid wipe" has to be run before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up sessionid.tdb and connections.tdb. While there, this also cleans up overloading connections.tdb with all the process entries just for messaging_send_all(). Volker 2010-03-02 19:02:01 +03:00			`/*`
			`Unix SMB/CIFS implementation.`
			`Implementation of a reliable server_exists()`
			`Copyright (C) Volker Lendecke 2010`

			`This program is free software; you can redistribute it and/or modify`
			`it under the terms of the GNU General Public License as published by`
			`the Free Software Foundation; either version 3 of the License, or`
			`(at your option) any later version.`

			`This program is distributed in the hope that it will be useful,`
			`but WITHOUT ANY WARRANTY; without even the implied warranty of`
			`MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the`
			`GNU General Public License for more details.`

			`You should have received a copy of the GNU General Public License`
			`along with this program. If not, see <http://www.gnu.org/licenses/>.`
			`*/`

			`#include "includes.h"`
lib: Add lib/util/server_id.h Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Ralph Boehme <slow@samba.org> 2017-01-01 23:00:55 +03:00			`#include "lib/util/server_id.h"`
			`#include "serverid.h"`
lib/util: Add back control of mmap and hash size in tdb for top level build This passes down a struct loadparm_context to allow these parameters to be checked. This may be s3 or s4 context, allowing the #if _SAMBA_BUILD_ macro to go away safely. Andrew Bartlett 2011-10-12 16:01:08 +04:00			`#include "lib/param/param.h"`
s3-ctdb: Make use of CTDB_CONTROL_CHECK_SRVIDS This should be a lot quicker than PROCESS_EXISTS followed by looking at serverid.tdb Autobuild-User: Volker Lendecke <vlendec@samba.org> Autobuild-Date: Wed Nov 30 12:47:27 CET 2011 on sn-devel-104 2011-10-31 19:30:38 +04:00			`#include "ctdbd_conn.h"`
ctdb_conn: Use messaging_ctdb_connection Replace messaging_ctdbd_connection Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Ralph Boehme <slow@samba.org> 2017-06-16 14:00:59 +03:00			`#include "lib/messages_ctdb.h"`
lib: Use messaging_dgm_get_unique in serverid_exists This is a relevant change: I was experimenting with server_id_db_set_exclusive() in "net" and got failures all over the place. The main reason was that "net" by default does not do a serverid_register. With messaging_dgm we have the process' unique id available via the lockfile contents. Using open/read/close is a bit slower than local tdb access, but this version is safe for all processes which have done messaging_init() Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org> 2015-09-30 02:11:08 +03:00			`#include "lib/messages_dgm.h"`
			`static bool serverid_exists_local(const struct server_id *id)`
			`{`
			`bool exists = process_exists_by_pid(id->pid);`
			`uint64_t unique;`
			`int ret;`

			`if (!exists) {`
			`return false;`
			`}`

			`if (id->unique_id == SERVERID_UNIQUE_ID_NOT_TO_VERIFY) {`
			`return true;`
			`}`
Re-arrange the optimization to reduce tdb fcntl calls if smbd is not clustered. procid_is_me() is much cheaper to test and can optimize up to 50% of the calls to serverid_exists(). Volker please check. Autobuild-User: Jeremy Allison <jra@samba.org> Autobuild-Date: Sat Aug 20 01:15:07 CEST 2011 on sn-devel-104 2011-08-19 21:32:29 +04:00
			`ret = messaging_dgm_get_unique(id->pid, &unique);`
			`if (ret != 0) {`
s3: Fix serverid_exists In the cluster case it can happen that a node just died and we did not yet have the time to clean up serverid.tdb. If the corresponding serverid.tdb record that represented a process was migrated away from the dead record, it represents existence of a process where it is already dead. 2010-12-03 11:34:02 +03:00			`return false;`
			`}`

			`return (unique == id->unique_id);`
			`}`

			`bool serverid_exists(const struct server_id *id)`
			`{`
			`if (procid_is_local(id)) {`
			`return serverid_exists_local(id);`
			`}`

			`if (lp_clustering()) {`
			`return ctdbd_process_exists(messaging_ctdb_connection(),`
lib: Add "unique_id" to ctdbd_process_exists Bug: https://bugzilla.samba.org/show_bug.cgi?id=13042 Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2017-08-29 14:26:20 +03:00			`id->vnn, id->pid, id->unique_id);`
			`}`

			`return false;`
			`}`
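
The recycled-PID check described in the first commit message can be illustrated with a minimal, self-contained sketch. It is not Samba's implementation: the `sketch_` names, the fixed-size in-memory registry, and the registration helper are all assumptions made for illustration. The real code obtains the 64-bit value through `messaging_dgm_get_unique()` or `ctdbd_process_exists()`, as shown in the listing. The sketch only demonstrates the core idea: a PID alone is not enough, so a lookup succeeds only when the stored 64-bit tag also matches.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical stand-in for struct server_id: a PID plus a random 64-bit tag. */
struct sketch_server_id {
	pid_t pid;
	uint64_t unique_id;
};

/* Hypothetical in-memory registry (assumption; not how Samba stores the tag). */
#define SKETCH_MAX_ENTRIES 8
static struct sketch_server_id sketch_registry[SKETCH_MAX_ENTRIES];
static size_t sketch_num_entries;

/* Record the tag for a PID, overwriting any tag left by a previous owner. */
static bool sketch_register(pid_t pid, uint64_t unique_id)
{
	size_t i;

	for (i = 0; i < sketch_num_entries; i++) {
		if (sketch_registry[i].pid == pid) {
			sketch_registry[i].unique_id = unique_id;
			return true;
		}
	}
	if (sketch_num_entries == SKETCH_MAX_ENTRIES) {
		return false; /* registry full */
	}
	sketch_registry[sketch_num_entries].pid = pid;
	sketch_registry[sketch_num_entries].unique_id = unique_id;
	sketch_num_entries++;
	return true;
}

/* True only if the PID is registered and its current tag equals the stored one. */
static bool sketch_exists(const struct sketch_server_id *id)
{
	size_t i;

	for (i = 0; i < sketch_num_entries; i++) {
		if (sketch_registry[i].pid == id->pid) {
			return sketch_registry[i].unique_id == id->unique_id;
		}
	}
	return false; /* PID was never registered */
}
```

In this sketch, a lock record that stored PID 4242 with one tag stops matching as soon as a new process registers PID 4242 with a different tag, which mirrors the "a new smbd has taken over" case in the commit message.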
