s3: Fix a long-standing problem with recycled PIDs

When a Samba server process dies hard, it has no chance to clean up its
entries in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.

For locking.tdb and brlock.tdb, Samba is robust because every time we read
an entry from the database we check whether the corresponding process still
exists. If it no longer exists, the entry is deleted. This is not 100%
failsafe though: on systems with a limited PID space there is a non-zero
chance that, between the smbd's death and the fresh access, the PID is
recycled by another long-running process. This renders all files that had
been locked by the killed smbd potentially unusable until the new process
also dies.

This patch fixes the problem in the following way: every process ID in
every database is augmented by a random 64-bit number that is stored in
serverid.tdb. Whenever we need to check whether a process still exists, we
know its PID and the 64-bit number. We look up the PID in serverid.tdb and
compare the 64-bit number. If it is the same, the process is still a valid
smbd holding the lock. If it is different, a new smbd has taken over.

I believe this is safe against an smbd that has died hard and whose PID has
been taken over by a non-Samba process. Such a process would not have
registered itself with a fresh 64-bit number in serverid.tdb, so the old
number would still exist there. We protect against this case by having the
parent smbd take care of deregistering PIDs from serverid.tdb, and by the
fact that serverid.tdb is opened CLEAR_IF_FIRST.

CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does
not work when all smbds are restarted. For this, "net serverid wipe" has to
be run before smbd starts up. As a convenience, "net serverid wipedbs" also
cleans up sessionid.tdb and connections.tdb.

While there, this also stops overloading connections.tdb with all the
process entries just for messaging_send_all().

Volker
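To make the check described above concrete, here is a minimal, self-contained
sketch. The table and helper names (serverid_table, lookup_unique_id,
owner_still_exists) are illustrative assumptions standing in for serverid.tdb
and Samba's real serverid API; they are not the actual implementation.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Illustrative stand-in for serverid.tdb: PID -> registered unique_id. */
struct registered_server {
	pid_t pid;
	uint64_t unique_id;
};

static const struct registered_server serverid_table[] = {
	{ 1234, 0x9e3779b97f4a7c15ULL }, /* a live smbd */
};

/* Hypothetical lookup: find the unique_id registered for a PID. */
static bool lookup_unique_id(pid_t pid, uint64_t *unique_id)
{
	size_t i;
	for (i = 0; i < sizeof(serverid_table)/sizeof(serverid_table[0]); i++) {
		if (serverid_table[i].pid == pid) {
			*unique_id = serverid_table[i].unique_id;
			return true;
		}
	}
	return false;
}

/* The check from the commit message: an entry's owner is alive only if
 * its PID is still registered AND the registered 64-bit number matches
 * the one stored with the entry. A recycled PID fails either the first
 * test (never registered) or the second (fresh random number). */
static bool owner_still_exists(pid_t pid, uint64_t stored_unique_id)
{
	uint64_t current;

	if (!lookup_unique_id(pid, &current)) {
		return false;
	}
	return current == stored_unique_id;
}

int main(void)
{
	printf("same smbd:    %d\n",
	       owner_still_exists(1234, 0x9e3779b97f4a7c15ULL));
	printf("recycled PID: %d\n", owner_still_exists(1234, 0xdeadbeefULL));
	printf("dead PID:     %d\n", owner_still_exists(4321, 0xdeadbeefULL));
	return 0;
}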
/*
   Samba Unix/Linux SMB client library
   net serverid commands
   Copyright (C) Volker Lendecke 2010

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see <http://www.gnu.org/licenses/>.
*/

#include "includes.h"
#include "utils/net.h"
#include "dbwrap/dbwrap.h"
#include "dbwrap/dbwrap_rbt.h"
#include "serverid.h"
#include "session.h"
#include "lib/conn_tdb.h"
#include "smbd/globals.h"
#include "util_tdb.h"
static int net_serverid_list_fn(const struct server_id *id,
				uint32_t msg_flags, void *priv)
{
	char *str = server_id_str(talloc_tos(), id);
d_printf ( " %s %llu 0x%x \n " , str , ( unsigned long long ) id - > unique_id ,
( unsigned int ) msg_flags ) ;
TALLOC_FREE ( str ) ;
return 0 ;
}
static int net_serverid_list ( struct net_context * c , int argc ,
const char * * argv )
{
d_printf ( " pid unique_id msg_flags \n " ) ;
2011-08-17 12:09:57 +04:00
return serverid_traverse_read ( net_serverid_list_fn , NULL ) ? 0 : - 1 ;
}

static int net_serverid_wipe_fn(struct db_record *rec,
				const struct server_id *id,
				uint32_t msg_flags, void *private_data)
{
	NTSTATUS status;

	if (id->vnn != get_my_vnn()) {
		return 0;
	}

	status = dbwrap_record_delete(rec);
	if (!NT_STATUS_IS_OK(status)) {
		char *str = server_id_str(talloc_tos(), id);
		DEBUG(1, ("Could not delete serverid.tdb record %s: %s\n",
			  str, nt_errstr(status)));
		TALLOC_FREE(str);
	}
	return 0;
}

static int net_serverid_wipe(struct net_context *c, int argc,
			     const char **argv)
{
	return serverid_traverse(net_serverid_wipe_fn, NULL) ? 0 : -1;
}
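/*
 * "net serverid wipedbs" machinery: build an in-memory (rbt) map from a
 * server's unique_id to a wipedbs_server_data entry, attach to each entry
 * all session, tcon and open records found for that server, determine
 * which servers still exist, and finally delete the records of servers
 * that have vanished.
 */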
struct wipedbs_record_marker {
	struct wipedbs_record_marker *prev, *next;
	TDB_DATA key, val;
	const char *desc;
};

struct wipedbs_server_data {
	struct server_id server_id;
	const char *server_id_str;
	bool exists;
	struct wipedbs_record_marker *session_records;
	struct wipedbs_record_marker *tcon_records;
	struct wipedbs_record_marker *open_records;
};
struct wipedbs_state {
	struct db_context *id2server_data;
	struct {
		struct {
			int total;
			int existing;
			int disconnected;
		} server;
		struct {
			int total;
			int disconnected;
			int todelete;
			int failure;
		} session, tcon, open;
		int open_timed_out;
	} stat;
	struct server_id *server_ids;
	bool *server_exists;
	int idx;
	struct db_context *session_db;
	struct db_context *tcon_db;
	struct db_context *open_db;
	struct timeval now;
	bool testmode;
	bool verbose;
};
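/*
 * Look up the server entry for @id in the temporary rbt database, keyed
 * by unique_id, creating and storing a fresh wipedbs_server_data on first
 * sight. Panics if two different server_ids share a unique_id.
 */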
static struct wipedbs_server_data *get_server_data(struct wipedbs_state *state,
						   const struct server_id *id)
{
	struct wipedbs_server_data *ret = NULL;
	TDB_DATA key, val = tdb_null;
	NTSTATUS status;

	key = make_tdb_data((const void *)&id->unique_id,
			    sizeof(id->unique_id));
	status = dbwrap_fetch(state->id2server_data, talloc_tos(), key, &val);
	if (NT_STATUS_IS_OK(status)) {
		ret = *(struct wipedbs_server_data **)val.dptr;
		TALLOC_FREE(val.dptr);
	} else if (NT_STATUS_EQUAL(status, NT_STATUS_NOT_FOUND)) {
		ret = talloc_zero(state->id2server_data,
				  struct wipedbs_server_data);
		if (ret == NULL) {
			DEBUG(0, ("Failed to allocate server entry for %s\n",
				  server_id_str(talloc_tos(), id)));
			goto done;
		}
		ret->server_id = *id;
		ret->server_id_str = server_id_str(ret, id);
		ret->exists = true;
		val = make_tdb_data((const void *)&ret, sizeof(ret));
		status = dbwrap_store(state->id2server_data,
				      key, val, TDB_INSERT);
		if (!NT_STATUS_IS_OK(status)) {
			DEBUG(0, ("Failed to store server entry for %s: %s\n",
				  server_id_str(talloc_tos(), id),
				  nt_errstr(status)));
		}
		goto done;
	} else {
		DEBUG(0, ("Failed to fetch server entry for %s: %s\n",
			  server_id_str(talloc_tos(), id),
			  nt_errstr(status)));
		goto done;
	}
	if (!server_id_equal(id, &ret->server_id)) {
		DEBUG(0, ("uniq id collision for %s and %s\n",
			  server_id_str(talloc_tos(), id),
			  server_id_str(talloc_tos(), &ret->server_id)));
		smb_panic("server_id->unique_id not unique!");
	}
done:
	return ret;
}
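/*
 * Traverse callbacks for the smbXsrv session/tcon/open databases: each one
 * attaches a record marker (key, value snapshot and human-readable
 * description) to the owning server's entry, so the record can be deleted
 * later if that server turns out to be gone.
 */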
static int wipedbs_traverse_sessions(struct smbXsrv_session_global0 *session,
				     void *wipedbs_state)
{
	struct wipedbs_state *state =
		talloc_get_type_abort(wipedbs_state,
				      struct wipedbs_state);
	struct wipedbs_server_data *sd;
	struct wipedbs_record_marker *rec;
	TDB_DATA tmp;
	int ret = -1;

	assert(session->num_channels == 1);

	state->stat.session.total++;

	sd = get_server_data(state, &session->channels[0].server_id);
	if (sd == NULL) {
		goto done;
	}

	if (server_id_is_disconnected(&sd->server_id)) {
		state->stat.session.disconnected++;
	}

	rec = talloc_zero(sd, struct wipedbs_record_marker);
	if (rec == NULL) {
		DEBUG(0, ("Out of memory!\n"));
		goto done;
	}

	tmp = dbwrap_record_get_key(session->db_rec);
	rec->key = tdb_data_talloc_copy(rec, tmp);
	tmp = dbwrap_record_get_value(session->db_rec);
	rec->val = tdb_data_talloc_copy(rec, tmp);

	rec->desc = talloc_asprintf(
		rec, "session[global: %u wire: %llu]",
		session->session_global_id,
		(long long unsigned)session->session_wire_id);

	if ((rec->key.dptr == NULL) || (rec->val.dptr == NULL) ||
	    (rec->desc == NULL))
	{
		DEBUG(0, ("Out of memory!\n"));
		goto done;
	}

	state->session_db = dbwrap_record_get_db(session->db_rec);

	DLIST_ADD(sd->session_records, rec);
	ret = 0;
done:
	return ret;
}
static int wipedbs_traverse_tcon(struct smbXsrv_tcon_global0 *tcon,
				 void *wipedbs_state)
{
	struct wipedbs_state *state =
		talloc_get_type_abort(wipedbs_state,
				      struct wipedbs_state);
	struct wipedbs_server_data *sd;
	struct wipedbs_record_marker *rec;
	TDB_DATA tmp;
	int ret = -1;

	state->stat.tcon.total++;

	sd = get_server_data(state, &tcon->server_id);
	if (sd == NULL) {
		goto done;
	}

	if (server_id_is_disconnected(&sd->server_id)) {
		state->stat.tcon.disconnected++;
	}

	rec = talloc_zero(sd, struct wipedbs_record_marker);
	if (rec == NULL) {
		DEBUG(0, ("Out of memory!\n"));
		goto done;
	}

	tmp = dbwrap_record_get_key(tcon->db_rec);
	rec->key = tdb_data_talloc_copy(rec, tmp);
	tmp = dbwrap_record_get_value(tcon->db_rec);
	rec->val = tdb_data_talloc_copy(rec, tmp);

	rec->desc = talloc_asprintf(
		rec, "tcon[global: %u wire: %u session: %u share: %s]",
		tcon->tcon_global_id, tcon->tcon_wire_id,
		tcon->session_global_id, tcon->share_name);

	if ((rec->key.dptr == NULL) || (rec->val.dptr == NULL) ||
	    (rec->desc == NULL))
	{
		DEBUG(0, ("Out of memory!\n"));
		goto done;
	}

	state->tcon_db = dbwrap_record_get_db(tcon->db_rec);

	DLIST_ADD(sd->tcon_records, rec);
	ret = 0;
done:
	return ret;
}
static int wipedbs_traverse_open(struct smbXsrv_open_global0 *open,
				 void *wipedbs_state)
{
	struct wipedbs_state *state =
		talloc_get_type_abort(wipedbs_state,
				      struct wipedbs_state);
	struct wipedbs_server_data *sd;
	struct wipedbs_record_marker *rec;
	TDB_DATA tmp;
	int ret = -1;

	state->stat.open.total++;

	sd = get_server_data(state, &open->server_id);
	if (sd == NULL) {
		goto done;
	}

	if (server_id_is_disconnected(&sd->server_id)) {
		struct timeval disconnect_time;
		int64_t tdiff;
		bool reached;

		state->stat.open.disconnected++;

		nttime_to_timeval(&disconnect_time, open->disconnect_time);
		tdiff = usec_time_diff(&state->now, &disconnect_time);
		reached = (tdiff >= 1000*open->durable_timeout_msec);

		if (state->verbose) {
			TALLOC_CTX *mem_ctx = talloc_new(talloc_tos());
			d_printf("open[global: %u] disconnected at "
				 "[%s] %us ago with timeout of %us "
				 "-%s reached\n",
				 open->open_global_id,
				 nt_time_string(mem_ctx,
						open->disconnect_time),
				 (unsigned)(tdiff/1000000),
				 open->durable_timeout_msec / 1000,
				 reached ? "" : " not");
			talloc_free(mem_ctx);
		}

		if (!reached) {
			ret = 0;
			goto done;
		}
		state->stat.open_timed_out++;
	}

	rec = talloc_zero(sd, struct wipedbs_record_marker);
	if (rec == NULL) {
		DEBUG(0, ("Out of memory!\n"));
		goto done;
	}

	tmp = dbwrap_record_get_key(open->db_rec);
	rec->key = tdb_data_talloc_copy(rec, tmp);
	tmp = dbwrap_record_get_value(open->db_rec);
	rec->val = tdb_data_talloc_copy(rec, tmp);

	rec->desc = talloc_asprintf(
		rec, "open[global: %u persistent: %llu volatile: %llu]",
		open->open_global_id,
		(long long unsigned)open->open_persistent_id,
		(long long unsigned)open->open_volatile_id);

	if ((rec->key.dptr == NULL) || (rec->val.dptr == NULL) ||
	    (rec->desc == NULL))
	{
		DEBUG(0, ("Out of memory!\n"));
		goto done;
	}

	state->open_db = dbwrap_record_get_db(open->db_rec);

	DLIST_ADD(sd->open_records, rec);
	ret = 0;
done:
	return ret;
}
static int wipedbs_traverse_nop(struct db_record *rec, void *private_data)
{
	return 0;
}

static int wipedbs_traverse_fill_ids(struct db_record *rec,
				     void *wipedbs_state)
{
	struct wipedbs_state *state = talloc_get_type_abort(
		wipedbs_state, struct wipedbs_state);

	TDB_DATA val = dbwrap_record_get_value(rec);

	struct wipedbs_server_data *sd = talloc_get_type_abort(
		*(void **)val.dptr, struct wipedbs_server_data);

	state->server_ids[state->idx] = sd->server_id;
	state->idx++;
	return 0;
}
static int wipedbs_traverse_set_exists(struct db_record *rec,
				       void *wipedbs_state)
{
	struct wipedbs_state *state = talloc_get_type_abort(
		wipedbs_state, struct wipedbs_state);

	TDB_DATA val = dbwrap_record_get_value(rec);

	struct wipedbs_server_data *sd = talloc_get_type_abort(
		*(void **)val.dptr, struct wipedbs_server_data);

	/* assume a stable traverse order for rbt */
	SMB_ASSERT(server_id_equal(&state->server_ids[state->idx],
				   &sd->server_id));
	sd->exists = state->server_exists[state->idx];

	if (sd->exists) {
		state->stat.server.existing++;
	}
	if (server_id_is_disconnected(&sd->server_id)) {
		state->stat.server.disconnected++;
	}

	state->idx++;
	return 0;
}
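/*
 * Determine liveness for all collected servers in one batch: count the
 * entries, collect their server_ids into an array, ask serverids_exist()
 * once for the whole array, and write the results back into the
 * per-server entries.
 */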
static NTSTATUS wipedbs_check_server_exists(struct wipedbs_state *state)
{
	NTSTATUS status;
	bool ok;
	int num_servers;

	status = dbwrap_traverse_read(state->id2server_data,
				      wipedbs_traverse_nop, NULL,
				      &num_servers);
	if (!NT_STATUS_IS_OK(status)) {
		DEBUG(0, ("Failed to traverse temporary database\n"));
		goto done;
	}
	state->stat.server.total = num_servers;

	state->server_ids = talloc_array(state, struct server_id,
					 num_servers);
	state->server_exists = talloc_array(state, bool, num_servers);
	if (state->server_ids == NULL || state->server_exists == NULL) {
		DEBUG(0, ("Out of memory\n"));
		status = NT_STATUS_NO_MEMORY;
		goto done;
	}

	state->idx = 0;
	status = dbwrap_traverse_read(state->id2server_data,
				      wipedbs_traverse_fill_ids,
				      state, NULL);
	if (!NT_STATUS_IS_OK(status)) {
		DEBUG(0, ("Failed to traverse temporary database\n"));
		goto done;
	}

	ok = serverids_exist(state->server_ids, num_servers,
			     state->server_exists);
	if (!ok) {
		DEBUG(0, ("Calling serverids_exist failed\n"));
		status = NT_STATUS_UNSUCCESSFUL;
		goto done;
	}

	state->idx = 0;
	status = dbwrap_traverse_read(state->id2server_data,
				      wipedbs_traverse_set_exists,
				      state, NULL);
	if (!NT_STATUS_IS_OK(status)) {
		DEBUG(0, ("Failed to traverse temporary database\n"));
		goto done;
	}
done:
	TALLOC_FREE(state->server_ids);
	TALLOC_FREE(state->server_exists);
	return status;
}
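/*
 * Delete (or, with dry_run, merely count) the given list of records from
 * @db, skipping records whose value changed since they were collected.
 * Adds the number of records considered to @count and returns the number
 * of records that could not be deleted.
 */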
static int wipedbs_delete_records(struct db_context *db,
				  struct wipedbs_record_marker *records,
				  bool dry_run, bool verbose, int *count)
{
	struct wipedbs_record_marker *cur;
	struct db_record *rec;
	TDB_DATA val;
	NTSTATUS status;
	unsigned num = 0, total = 0;

	if (db == NULL) {
		return 0;
	}

	for (cur = records; cur != NULL; cur = cur->next) {
		total++;
		rec = dbwrap_fetch_locked(db, talloc_tos(), cur->key);
		if (rec == NULL) {
			DEBUG(0, ("Failed to fetch record <%s> from %s",
				  cur->desc, dbwrap_name(db)));
			continue;
		}
		val = dbwrap_record_get_value(rec);
		if (tdb_data_equal(val, cur->val)) {
			if (dry_run) {
				status = NT_STATUS_OK;
			} else {
				status = dbwrap_record_delete(rec);
			}
			if (NT_STATUS_IS_OK(status)) {
				num++;
			} else {
				DEBUG(0, ("Failed to delete record <%s> "
					  "from %s: %s\n",
					  cur->desc, dbwrap_name(db),
					  nt_errstr(status)));
			}
		} else {
			DEBUG(0, ("Warning: record <%s> from %s changed"
				  ", skip record!\n",
				  cur->desc, dbwrap_name(db)));
		}
		if (verbose) {
			d_printf("deleting %s\n", cur->desc);
		}
		TALLOC_FREE(rec);
	}

	if (verbose) {
		d_printf("Deleted %u of %u records from %s\n",
			 num, total, dbwrap_name(db));
	}

	if (count) {
		*count += total;
	}

	return total - num;
}
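/*
 * Per-server cleanup: for every server that no longer exists, delete its
 * collected session, tcon and open records, accumulating failure counts
 * in the statistics.
 */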
static int wipedbs_traverse_server_data(struct db_record *rec,
					void *wipedbs_state)
{
	struct wipedbs_state *state = talloc_get_type_abort(
		wipedbs_state, struct wipedbs_state);
	bool dry_run = state->testmode;
	TDB_DATA val = dbwrap_record_get_value(rec);
	int ret;

	struct wipedbs_server_data *sd = talloc_get_type_abort(
		*(void **)val.dptr, struct wipedbs_server_data);

	if (state->verbose) {
		d_printf("Server: '%s' %s\n", sd->server_id_str,
			 sd->exists ?
			 "exists" :
			 "does not exist, cleaning up...");
	}

	if (sd->exists) {
		return 0;
	}

	ret = wipedbs_delete_records(state->session_db, sd->session_records,
				     dry_run, state->verbose,
				     &state->stat.session.todelete);
	state->stat.session.failure += ret;

	ret = wipedbs_delete_records(state->tcon_db, sd->tcon_records,
				     dry_run, state->verbose,
				     &state->stat.tcon.todelete);
	state->stat.tcon.failure += ret;

	ret = wipedbs_delete_records(state->open_db, sd->open_records,
				     dry_run, state->verbose,
				     &state->stat.open.todelete);
	state->stat.open.failure += ret;

	return 0;
}
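/*
 * Implementation of "net serverid wipedbs": collect all session, tcon and
 * open records, check which owning servers still exist, wipe the leftovers
 * of vanished servers and print summary statistics. --test performs a dry
 * run, --verbose reports each record.
 */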
static int net_serverid_wipedbs(struct net_context *c, int argc,
				const char **argv)
{
	int ret = -1;
	NTSTATUS status;
	struct wipedbs_state *state = talloc_zero(talloc_tos(),
						  struct wipedbs_state);

	if (c->display_usage) {
		d_printf("%s\n%s",
			 _("Usage:"),
			 _("net serverid wipedbs [--test] [--verbose]\n"));
		d_printf("%s\n%s",
			 _("Example:"),
			 _("net serverid wipedbs -v\n"));
		return -1;
	}

	state->now = timeval_current();
	state->testmode = c->opt_testmode;
	state->verbose = c->opt_verbose;

	state->id2server_data = db_open_rbt(state);
	if (state->id2server_data == NULL) {
		DEBUG(0, ("Failed to open temporary database\n"));
		goto done;
	}

	status = smbXsrv_session_global_traverse(wipedbs_traverse_sessions,
						 state);
	if (!NT_STATUS_IS_OK(status)) {
		goto done;
	}

	status = smbXsrv_tcon_global_traverse(wipedbs_traverse_tcon, state);
	if (!NT_STATUS_IS_OK(status)) {
		goto done;
	}

	status = smbXsrv_open_global_traverse(wipedbs_traverse_open, state);
	if (!NT_STATUS_IS_OK(status)) {
		goto done;
	}

	status = wipedbs_check_server_exists(state);
	if (!NT_STATUS_IS_OK(status)) {
		goto done;
	}

	status = dbwrap_traverse_read(state->id2server_data,
				      wipedbs_traverse_server_data,
				      state, NULL);
	if (!NT_STATUS_IS_OK(status)) {
		DEBUG(0, ("Failed to traverse db: %s\n", nt_errstr(status)));
		goto done;
	}

	d_printf("Found %d serverids, %d alive and %d disconnected\n",
		 state->stat.server.total,
		 state->stat.server.existing,
		 state->stat.server.disconnected);
	d_printf("Found %d sessions, %d alive and %d disconnected"
		 ", cleaned up %d of %d entries\n",
		 state->stat.session.total,
		 state->stat.session.total - state->stat.session.todelete,
		 state->stat.session.disconnected,
		 state->stat.session.todelete - state->stat.session.failure,
		 state->stat.session.todelete);
	d_printf("Found %d tcons, %d alive and %d disconnected"
		 ", cleaned up %d of %d entries\n",
		 state->stat.tcon.total,
		 state->stat.tcon.total - state->stat.tcon.todelete,
		 state->stat.tcon.disconnected,
		 state->stat.tcon.todelete - state->stat.tcon.failure,
		 state->stat.tcon.todelete);
	d_printf("Found %d opens, %d alive, %d disconnected and %d timed out"
		 ", cleaned up %d of %d entries\n",
		 state->stat.open.total,
		 state->stat.open.total - state->stat.open.todelete
		 - (state->stat.open.disconnected
		    - state->stat.open_timed_out),
		 state->stat.open.disconnected,
		 state->stat.open_timed_out,
		 state->stat.open.todelete - state->stat.open.failure,
		 state->stat.open.todelete);

	ret = 0;
done:
	talloc_free(state);
	return ret;
}
int net_serverid(struct net_context *c, int argc, const char **argv)
{
	struct functable func[] = {
		{
			"list",
			net_serverid_list,
			NET_TRANSPORT_LOCAL,
			N_("List all entries from serverid.tdb"),
			N_("net serverid list\n"
			   "List all entries from serverid.tdb")
		},
		{
			"wipe",
			net_serverid_wipe,
			NET_TRANSPORT_LOCAL,
			N_("Wipe the serverid.tdb for the current node"),
			N_("net serverid wipe\n"
			   "Wipe the serverid.tdb for the current node")
		},
		{
			"wipedbs",
			net_serverid_wipedbs,
			NET_TRANSPORT_LOCAL,
			N_("Clean dead entries from temporary databases"),
N_ ( " net serverid wipedbs \n "
2012-08-23 16:02:22 +04:00
" Clean dead entries from temporary databases " )
		},
		{NULL, NULL, 0, NULL, NULL}
	};

	return net_run_function(c, argc, argv, "net serverid", func);
}