s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
/*
Unix SMB / CIFS implementation .
Implementation of a reliable server_exists ( )
Copyright ( C ) Volker Lendecke 2010
This program is free software ; you can redistribute it and / or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation ; either version 3 of the License , or
( at your option ) any later version .
This program is distributed in the hope that it will be useful ,
but WITHOUT ANY WARRANTY ; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE . See the
GNU General Public License for more details .
You should have received a copy of the GNU General Public License
along with this program . If not , see < http : //www.gnu.org/licenses/>.
*/
# include "includes.h"
2011-02-26 01:20:06 +03:00
# include "system/filesys.h"
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
# include "serverid.h"
2011-05-05 13:25:29 +04:00
# include "util_tdb.h"
2011-07-07 19:42:08 +04:00
# include "dbwrap/dbwrap.h"
2011-07-06 18:40:21 +04:00
# include "dbwrap/dbwrap_open.h"
2011-05-04 04:28:15 +04:00
# include "lib/util/tdb_wrap.h"
2011-10-12 16:01:08 +04:00
# include "lib/param/param.h"
2011-10-31 19:30:38 +04:00
# include "ctdbd_conn.h"
# include "messages.h"
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
struct serverid_key {
pid_t pid ;
2011-05-02 04:27:36 +04:00
uint32_t task_id ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
uint32_t vnn ;
} ;
2010-07-04 22:31:02 +04:00
struct serverid_data {
uint64_t unique_id ;
uint32_t msg_flags ;
} ;
2010-09-26 02:59:06 +04:00
bool serverid_parent_init ( TALLOC_CTX * mem_ctx )
2010-03-25 18:02:54 +03:00
{
struct tdb_wrap * db ;
2011-10-12 16:01:08 +04:00
struct loadparm_context * lp_ctx ;
lp_ctx = loadparm_init_s3 ( mem_ctx , loadparm_s3_context ( ) ) ;
if ( lp_ctx = = NULL ) {
DEBUG ( 0 , ( " loadparm_init_s3 failed \n " ) ) ;
return false ;
}
2010-03-25 18:02:54 +03:00
2010-03-25 18:44:41 +03:00
/*
* Open the tdb in the parent process ( smbd ) so that our
* CLEAR_IF_FIRST optimization in tdb_reopen_all can properly
* work .
*/
2010-09-26 02:59:06 +04:00
db = tdb_wrap_open ( mem_ctx , lock_path ( " serverid.tdb " ) ,
2010-09-27 16:46:07 +04:00
0 , TDB_DEFAULT | TDB_CLEAR_IF_FIRST | TDB_INCOMPATIBLE_HASH , O_RDWR | O_CREAT ,
2011-10-12 16:01:08 +04:00
0644 , lp_ctx ) ;
talloc_unlink ( mem_ctx , lp_ctx ) ;
2010-03-25 18:02:54 +03:00
if ( db = = NULL ) {
DEBUG ( 1 , ( " could not open serverid.tdb: %s \n " ,
strerror ( errno ) ) ) ;
return false ;
}
return true ;
}
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
static struct db_context * serverid_db ( void )
{
static struct db_context * db ;
if ( db ! = NULL ) {
return db ;
}
2010-09-26 03:02:04 +04:00
db = db_open ( NULL , lock_path ( " serverid.tdb " ) , 0 ,
2010-09-27 16:46:07 +04:00
TDB_DEFAULT | TDB_CLEAR_IF_FIRST | TDB_INCOMPATIBLE_HASH , O_RDWR | O_CREAT , 0644 ) ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
return db ;
}
static void serverid_fill_key ( const struct server_id * id ,
struct serverid_key * key )
{
ZERO_STRUCTP ( key ) ;
key - > pid = id - > pid ;
2011-05-02 04:27:36 +04:00
key - > task_id = id - > task_id ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
key - > vnn = id - > vnn ;
}
2010-07-04 18:08:03 +04:00
bool serverid_register ( const struct server_id id , uint32_t msg_flags )
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
{
struct db_context * db ;
struct serverid_key key ;
struct serverid_data data ;
struct db_record * rec ;
TDB_DATA tdbkey , tdbdata ;
NTSTATUS status ;
bool ret = false ;
db = serverid_db ( ) ;
if ( db = = NULL ) {
return false ;
}
2010-07-04 18:08:03 +04:00
serverid_fill_key ( & id , & key ) ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
tdbkey = make_tdb_data ( ( uint8_t * ) & key , sizeof ( key ) ) ;
2011-08-25 00:28:40 +04:00
rec = dbwrap_fetch_locked ( db , talloc_tos ( ) , tdbkey ) ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
if ( rec = = NULL ) {
DEBUG ( 1 , ( " Could not fetch_lock serverid.tdb record \n " ) ) ;
return false ;
}
ZERO_STRUCT ( data ) ;
2010-07-04 18:08:03 +04:00
data . unique_id = id . unique_id ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
data . msg_flags = msg_flags ;
tdbdata = make_tdb_data ( ( uint8_t * ) & data , sizeof ( data ) ) ;
2011-08-25 00:28:40 +04:00
status = dbwrap_record_store ( rec , tdbdata , 0 ) ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
if ( ! NT_STATUS_IS_OK ( status ) ) {
DEBUG ( 1 , ( " Storing serverid.tdb record failed: %s \n " ,
nt_errstr ( status ) ) ) ;
goto done ;
}
2011-10-31 19:30:38 +04:00
# ifdef HAVE_CTDB_CONTROL_CHECK_SRVIDS_DECL
if ( lp_clustering ( ) ) {
register_with_ctdbd ( messaging_ctdbd_connection ( ) , id . unique_id ) ;
}
# endif
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
ret = true ;
done :
TALLOC_FREE ( rec ) ;
return ret ;
}
2010-07-04 22:40:46 +04:00
bool serverid_register_msg_flags ( const struct server_id id , bool do_reg ,
uint32_t msg_flags )
{
struct db_context * db ;
struct serverid_key key ;
struct serverid_data * data ;
struct db_record * rec ;
2010-08-18 15:20:50 +04:00
TDB_DATA tdbkey ;
2011-08-25 00:28:40 +04:00
TDB_DATA value ;
2010-07-04 22:40:46 +04:00
NTSTATUS status ;
bool ret = false ;
db = serverid_db ( ) ;
if ( db = = NULL ) {
return false ;
}
serverid_fill_key ( & id , & key ) ;
tdbkey = make_tdb_data ( ( uint8_t * ) & key , sizeof ( key ) ) ;
2011-08-25 00:28:40 +04:00
rec = dbwrap_fetch_locked ( db , talloc_tos ( ) , tdbkey ) ;
2010-07-04 22:40:46 +04:00
if ( rec = = NULL ) {
DEBUG ( 1 , ( " Could not fetch_lock serverid.tdb record \n " ) ) ;
return false ;
}
2011-08-25 00:28:40 +04:00
value = dbwrap_record_get_value ( rec ) ;
if ( value . dsize ! = sizeof ( struct serverid_data ) ) {
2010-07-04 22:40:46 +04:00
DEBUG ( 1 , ( " serverid record has unexpected size %d "
2011-08-25 00:28:40 +04:00
" (wanted %d) \n " , ( int ) value . dsize ,
2010-07-21 02:12:07 +04:00
( int ) sizeof ( struct serverid_data ) ) ) ;
2010-07-04 22:40:46 +04:00
goto done ;
}
2011-08-25 00:28:40 +04:00
data = ( struct serverid_data * ) value . dptr ;
2010-07-04 22:40:46 +04:00
if ( do_reg ) {
data - > msg_flags | = msg_flags ;
} else {
data - > msg_flags & = ~ msg_flags ;
}
2011-08-25 00:28:40 +04:00
status = dbwrap_record_store ( rec , value , 0 ) ;
2010-07-04 22:40:46 +04:00
if ( ! NT_STATUS_IS_OK ( status ) ) {
DEBUG ( 1 , ( " Storing serverid.tdb record failed: %s \n " ,
nt_errstr ( status ) ) ) ;
goto done ;
}
ret = true ;
done :
TALLOC_FREE ( rec ) ;
return ret ;
}
2010-07-04 18:08:03 +04:00
bool serverid_deregister ( struct server_id id )
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
{
struct db_context * db ;
struct serverid_key key ;
struct db_record * rec ;
TDB_DATA tdbkey ;
NTSTATUS status ;
bool ret = false ;
db = serverid_db ( ) ;
if ( db = = NULL ) {
return false ;
}
2010-07-04 18:08:03 +04:00
serverid_fill_key ( & id , & key ) ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
tdbkey = make_tdb_data ( ( uint8_t * ) & key , sizeof ( key ) ) ;
2011-08-25 00:28:40 +04:00
rec = dbwrap_fetch_locked ( db , talloc_tos ( ) , tdbkey ) ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
if ( rec = = NULL ) {
DEBUG ( 1 , ( " Could not fetch_lock serverid.tdb record \n " ) ) ;
return false ;
}
2011-08-25 00:28:40 +04:00
status = dbwrap_record_delete ( rec ) ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
if ( ! NT_STATUS_IS_OK ( status ) ) {
DEBUG ( 1 , ( " Deleting serverid.tdb record failed: %s \n " ,
nt_errstr ( status ) ) ) ;
goto done ;
}
ret = true ;
done :
TALLOC_FREE ( rec ) ;
return ret ;
}
struct serverid_exists_state {
const struct server_id * id ;
bool exists ;
} ;
static int server_exists_parse ( TDB_DATA key , TDB_DATA data , void * priv )
{
struct serverid_exists_state * state =
( struct serverid_exists_state * ) priv ;
if ( data . dsize ! = sizeof ( struct serverid_data ) ) {
return - 1 ;
}
2010-07-04 16:35:05 +04:00
/*
* Use memcmp , not direct compare . data . dptr might not be
* aligned .
*/
2010-07-04 16:59:23 +04:00
state - > exists = ( memcmp ( & state - > id - > unique_id , data . dptr ,
2010-07-04 22:04:55 +04:00
sizeof ( state - > id - > unique_id ) ) = = 0 ) ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
return 0 ;
}
bool serverid_exists ( const struct server_id * id )
{
struct db_context * db ;
struct serverid_exists_state state ;
struct serverid_key key ;
TDB_DATA tdbkey ;
2011-08-22 12:21:09 +04:00
if ( procid_is_me ( id ) ) {
2011-08-19 21:32:29 +04:00
return true ;
}
if ( ! process_exists ( * id ) ) {
2010-12-03 11:34:02 +03:00
return false ;
}
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
db = serverid_db ( ) ;
if ( db = = NULL ) {
return false ;
}
serverid_fill_key ( id , & key ) ;
tdbkey = make_tdb_data ( ( uint8_t * ) & key , sizeof ( key ) ) ;
state . id = id ;
state . exists = false ;
2011-08-25 00:28:40 +04:00
if ( dbwrap_parse_record ( db , tdbkey , server_exists_parse , & state ) ! = 0 ) {
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
return false ;
}
return state . exists ;
}
2011-10-26 15:36:56 +04:00
bool serverids_exist ( const struct server_id * ids , int num_ids , bool * results )
{
struct db_context * db ;
int i ;
2011-10-31 19:30:38 +04:00
# ifdef HAVE_CTDB_CONTROL_CHECK_SRVIDS_DECL
if ( lp_clustering ( ) ) {
return ctdb_serverids_exist ( messaging_ctdbd_connection ( ) ,
ids , num_ids , results ) ;
}
# endif
2011-10-26 15:36:56 +04:00
if ( ! processes_exist ( ids , num_ids , results ) ) {
return false ;
}
db = serverid_db ( ) ;
if ( db = = NULL ) {
return false ;
}
for ( i = 0 ; i < num_ids ; i + + ) {
struct serverid_exists_state state ;
struct serverid_key key ;
TDB_DATA tdbkey ;
if ( ! results [ i ] ) {
continue ;
}
serverid_fill_key ( & ids [ i ] , & key ) ;
tdbkey = make_tdb_data ( ( uint8_t * ) & key , sizeof ( key ) ) ;
state . id = & ids [ i ] ;
state . exists = false ;
dbwrap_parse_record ( db , tdbkey , server_exists_parse , & state ) ;
results [ i ] = state . exists ;
}
return true ;
}
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
static bool serverid_rec_parse ( const struct db_record * rec ,
struct server_id * id , uint32_t * msg_flags )
{
struct serverid_key key ;
struct serverid_data data ;
2011-08-25 00:28:40 +04:00
TDB_DATA tdbkey ;
TDB_DATA tdbdata ;
tdbkey = dbwrap_record_get_key ( rec ) ;
tdbdata = dbwrap_record_get_value ( rec ) ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
2011-08-25 00:28:40 +04:00
if ( tdbkey . dsize ! = sizeof ( key ) ) {
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
DEBUG ( 1 , ( " Found invalid key length %d in serverid.tdb \n " ,
2011-08-25 00:28:40 +04:00
( int ) tdbkey . dsize ) ) ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
return false ;
}
2011-08-25 00:28:40 +04:00
if ( tdbdata . dsize ! = sizeof ( data ) ) {
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
DEBUG ( 1 , ( " Found invalid value length %d in serverid.tdb \n " ,
2011-08-25 00:28:40 +04:00
( int ) tdbdata . dsize ) ) ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
return false ;
}
2011-08-25 00:28:40 +04:00
memcpy ( & key , tdbkey . dptr , sizeof ( key ) ) ;
memcpy ( & data , tdbdata . dptr , sizeof ( data ) ) ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
id - > pid = key . pid ;
2011-05-02 04:27:36 +04:00
id - > task_id = key . task_id ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
id - > vnn = key . vnn ;
id - > unique_id = data . unique_id ;
* msg_flags = data . msg_flags ;
return true ;
}
struct serverid_traverse_read_state {
int ( * fn ) ( const struct server_id * id , uint32_t msg_flags ,
void * private_data ) ;
void * private_data ;
} ;
static int serverid_traverse_read_fn ( struct db_record * rec , void * private_data )
{
struct serverid_traverse_read_state * state =
( struct serverid_traverse_read_state * ) private_data ;
struct server_id id ;
uint32_t msg_flags ;
if ( ! serverid_rec_parse ( rec , & id , & msg_flags ) ) {
return 0 ;
}
return state - > fn ( & id , msg_flags , state - > private_data ) ;
}
bool serverid_traverse_read ( int ( * fn ) ( const struct server_id * id ,
uint32_t msg_flags , void * private_data ) ,
void * private_data )
{
struct db_context * db ;
struct serverid_traverse_read_state state ;
2011-08-17 12:08:31 +04:00
NTSTATUS status ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
db = serverid_db ( ) ;
if ( db = = NULL ) {
return false ;
}
state . fn = fn ;
state . private_data = private_data ;
2011-08-17 12:08:31 +04:00
status = dbwrap_traverse_read ( db , serverid_traverse_read_fn , & state ,
NULL ) ;
return NT_STATUS_IS_OK ( status ) ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
}
struct serverid_traverse_state {
int ( * fn ) ( struct db_record * rec , const struct server_id * id ,
uint32_t msg_flags , void * private_data ) ;
void * private_data ;
} ;
static int serverid_traverse_fn ( struct db_record * rec , void * private_data )
{
struct serverid_traverse_state * state =
( struct serverid_traverse_state * ) private_data ;
struct server_id id ;
uint32_t msg_flags ;
if ( ! serverid_rec_parse ( rec , & id , & msg_flags ) ) {
return 0 ;
}
return state - > fn ( rec , & id , msg_flags , state - > private_data ) ;
}
bool serverid_traverse ( int ( * fn ) ( struct db_record * rec ,
const struct server_id * id ,
uint32_t msg_flags , void * private_data ) ,
void * private_data )
{
struct db_context * db ;
struct serverid_traverse_state state ;
2011-08-17 12:06:07 +04:00
NTSTATUS status ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
db = serverid_db ( ) ;
if ( db = = NULL ) {
return false ;
}
state . fn = fn ;
state . private_data = private_data ;
2011-08-17 12:06:07 +04:00
status = dbwrap_traverse ( db , serverid_traverse_fn , & state , NULL ) ;
return NT_STATUS_IS_OK ( status ) ;
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
2010-03-02 19:02:01 +03:00
}