2005-04-17 02:20:36 +04:00
/*
* linux/fs/lockd/clntlock.c
*
* Lock handling for the client side NLM implementation
*
* Copyright (C) 1996, Olaf Kirch <okir@monad.swb.de>
*/
# include <linux/module.h>
# include <linux/types.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include
those headers directly instead of assuming availability. As this
conversion needs to touch a large number of source files, the following
script is used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the following.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have a fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 11:04:11 +03:00
# include <linux/slab.h>
2005-04-17 02:20:36 +04:00
# include <linux/time.h>
# include <linux/nfs_fs.h>
# include <linux/sunrpc/clnt.h>
# include <linux/sunrpc/svc.h>
# include <linux/lockd/lockd.h>
2008-12-23 23:21:33 +03:00
# include <linux/kthread.h>
2005-04-17 02:20:36 +04:00
# define NLMDBG_FACILITY NLMDBG_CLIENT
/*
* Local function prototypes
*/
static int reclaimer ( void * ptr ) ;
/*
* The following functions handle blocking and granting from the
* client perspective.
*/
/*
* This is the representation of a blocked client lock.
*/
struct nlm_wait {
2005-06-22 21:16:31 +04:00
struct list_head b_list ; /* linked list */
2005-04-17 02:20:36 +04:00
wait_queue_head_t b_wait ; /* where to wait on */
struct nlm_host * b_host ;
struct file_lock * b_lock ; /* local file lock */
unsigned short b_reclaim ; /* got to reclaim lock */
2006-12-13 11:35:03 +03:00
__be32 b_status ; /* grant callback status */
2005-04-17 02:20:36 +04:00
} ;
2005-06-22 21:16:31 +04:00
static LIST_HEAD ( nlm_blocked ) ;
2010-09-22 17:50:35 +04:00
static DEFINE_SPINLOCK ( nlm_blocked_lock ) ;
2005-04-17 02:20:36 +04:00
2008-01-12 01:09:44 +03:00
/**
* nlmclnt_init - Set up per-NFS mount point lockd data structures
2008-01-16 00:04:20 +03:00
* @nlm_init: pointer to arguments structure
2008-01-12 01:09:44 +03:00
*
* Returns pointer to an appropriate nlm_host struct,
* or an ERR_PTR value.
*/
2008-01-16 00:04:20 +03:00
struct nlm_host * nlmclnt_init ( const struct nlmclnt_initdata * nlm_init )
2008-01-12 01:09:44 +03:00
{
struct nlm_host * host ;
2008-01-16 00:04:20 +03:00
u32 nlm_version = ( nlm_init -> nfs_version == 2 ) ? 1 : 4 ;
2008-01-12 01:09:44 +03:00
int status ;
2008-10-04 01:15:30 +04:00
status = lockd_up ( ) ;
2008-01-12 01:09:44 +03:00
if ( status < 0 )
return ERR_PTR ( status ) ;
2008-10-03 20:50:21 +04:00
host = nlmclnt_lookup_host ( nlm_init -> address , nlm_init -> addrlen ,
2008-01-16 00:04:20 +03:00
nlm_init -> protocol , nlm_version ,
2008-12-23 23:21:38 +03:00
nlm_init -> hostname , nlm_init -> noresvport ) ;
2008-01-12 01:09:44 +03:00
if ( host == NULL ) {
lockd_down ( ) ;
return ERR_PTR ( -ENOLCK ) ;
}
return host ;
}
EXPORT_SYMBOL_GPL ( nlmclnt_init ) ;
/**
* nlmclnt_done - Release resources allocated by nlmclnt_init()
* @host: nlm_host structure reserved by nlmclnt_init()
*
*/
void nlmclnt_done ( struct nlm_host * host )
{
lockd: Create client-side nlm_host cache
NFS clients don't need the garbage collection processing that is
performed on nlm_host structures. The client picks up an nlm_host at
mount time and holds a reference to it until the file system is
unmounted.
Servers, on the other hand, don't have a precise way to tell when an
nlm_host is no longer being used, so zero refcount nlm_host entries
are left to expire in the cache after a time.
Basically there's nothing holding a reference to an nlm_host between
individual server-side NLM requests, but we can't afford the expense
of recreating them for every new NLM request from a client. The
nlm_host cache adds some lifetime hysteresis to entries in the cache
so the next time a particular nlm_host is needed, it's likely to be
discovered by a lookup rather than created from whole cloth.
With the new implementation, client nlm_host cache items are no longer
garbage collected, and are destroyed directly by a new release
function specialized for client entries, nlmclnt_release_host(). They
are cached in their own data structure, and have their own lookup
logic, simplified and specialized for client nlm_host entries.
However, the client nlm_host cache still shares reboot recovery logic
with the server nlm_host cache. The NSM "peer rebooted" downcalls for
clients and servers still come through the same RPC call. This is a
legacy formal API that would be difficult to alter, and besides, the
user space NSM implementation can't tell the difference between peers
that are clients or servers.
For this reason, the client cache continues to share the
nlm_host_mutex (and reboot recovery logic) with the server cache.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-12-14 18:05:52 +03:00
nlmclnt_release_host ( host ) ;
2008-01-12 01:09:44 +03:00
lockd_down ( ) ;
}
EXPORT_SYMBOL_GPL ( nlmclnt_done ) ;
2005-04-17 02:20:36 +04:00
/*
2005-06-22 21:16:31 +04:00
* Queue up a lock for blocking so that the GRANTED request can see it
2005-04-17 02:20:36 +04:00
*/
2006-03-20 21:44:44 +03:00
struct nlm_wait * nlmclnt_prepare_block ( struct nlm_host * host , struct file_lock * fl )
2005-06-22 21:16:31 +04:00
{
struct nlm_wait * block ;
block = kmalloc ( sizeof ( * block ) , GFP_KERNEL ) ;
2006-03-20 21:44:44 +03:00
if ( block != NULL ) {
block -> b_host = host ;
block -> b_lock = fl ;
init_waitqueue_head ( & block -> b_wait ) ;
2006-12-13 11:35:03 +03:00
block -> b_status = nlm_lck_blocked ;
2010-09-22 17:50:35 +04:00
spin_lock ( & nlm_blocked_lock ) ;
2006-03-20 21:44:44 +03:00
list_add ( & block -> b_list , & nlm_blocked ) ;
2010-09-22 17:50:35 +04:00
spin_unlock ( & nlm_blocked_lock ) ;
2006-03-20 21:44:44 +03:00
}
return block ;
2005-06-22 21:16:31 +04:00
}
2006-03-20 21:44:44 +03:00
void nlmclnt_finish_block ( struct nlm_wait * block )
2005-04-17 02:20:36 +04:00
{
2005-06-22 21:16:31 +04:00
if ( block == NULL )
return ;
2010-09-22 17:50:35 +04:00
spin_lock ( & nlm_blocked_lock ) ;
2005-06-22 21:16:31 +04:00
list_del ( & block -> b_list ) ;
2010-09-22 17:50:35 +04:00
spin_unlock ( & nlm_blocked_lock ) ;
2005-06-22 21:16:31 +04:00
kfree ( block ) ;
}
2005-04-17 02:20:36 +04:00
2005-06-22 21:16:31 +04:00
/*
* Block on a lock
*/
2006-03-20 21:44:44 +03:00
int nlmclnt_block ( struct nlm_wait * block , struct nlm_rqst * req , long timeout )
2005-06-22 21:16:31 +04:00
{
long ret ;
2005-04-17 02:20:36 +04:00
2005-06-22 21:16:31 +04:00
/* A borken server might ask us to block even if we didn't
* request it. Just say no!
*/
2006-03-20 21:44:44 +03:00
if ( block == NULL )
2005-06-22 21:16:31 +04:00
return -EAGAIN ;
2005-04-17 02:20:36 +04:00
/* Go to sleep waiting for GRANT callback. Some servers seem
* to lose callbacks, however, so we're going to poll from
* time to time just to make sure.
*
* For now, the retry frequency is pretty high; normally
* a 1 minute timeout would do. See the comment before
* nlmclnt_lock for an explanation.
*/
2005-06-22 21:16:31 +04:00
ret = wait_event_interruptible_timeout ( block -> b_wait ,
2006-12-13 11:35:03 +03:00
block -> b_status != nlm_lck_blocked ,
2005-06-22 21:16:31 +04:00
timeout ) ;
2006-03-20 21:44:44 +03:00
if ( ret < 0 )
return -ERESTARTSYS ;
req -> a_res . status = block -> b_status ;
return 0 ;
2005-04-17 02:20:36 +04:00
}
/*
* The server lockd has called us back to tell us the lock was granted
*/
2008-10-03 20:50:36 +04:00
__be32 nlmclnt_grant ( const struct sockaddr * addr , const struct nlm_lock * lock )
2005-04-17 02:20:36 +04:00
{
2006-02-15 00:53:04 +03:00
const struct file_lock * fl = & lock -> fl ;
const struct nfs_fh * fh = & lock -> fh ;
2005-04-17 02:20:36 +04:00
struct nlm_wait * block ;
2006-10-20 10:28:46 +04:00
__be32 res = nlm_lck_denied ;
2005-04-17 02:20:36 +04:00
/*
* Look up blocked request based on arguments.
* Warning: must not use cookie to match it!
*/
2010-09-22 17:50:35 +04:00
spin_lock ( & nlm_blocked_lock ) ;
2005-06-22 21:16:31 +04:00
list_for_each_entry ( block , & nlm_blocked , b_list ) {
2006-02-15 00:53:04 +03:00
struct file_lock * fl_blocked = block -> b_lock ;
2006-03-20 21:44:06 +03:00
if ( fl_blocked -> fl_start != fl -> fl_start )
continue ;
if ( fl_blocked -> fl_end != fl -> fl_end )
continue ;
/*
* Careful! The NLM server will return the 32-bit "pid" that
* we put on the wire: in this case the lockowner "pid".
*/
if ( fl_blocked -> fl_u . nfs_fl . owner -> pid != lock -> svid )
2006-02-15 00:53:04 +03:00
continue ;
2009-08-14 20:57:54 +04:00
if ( ! rpc_cmp_addr ( nlm_addr ( block -> b_host ) , addr ) )
2006-02-15 00:53:04 +03:00
continue ;
2006-12-08 13:37:18 +03:00
if ( nfs_compare_fh ( NFS_FH ( fl_blocked -> fl_file -> f_path . dentry -> d_inode ) , fh ) != 0 )
2006-02-15 00:53:04 +03:00
continue ;
/* Alright, we found a lock. Set the return status
* and wake up the caller
*/
2006-12-13 11:35:03 +03:00
block -> b_status = nlm_granted ;
2006-02-15 00:53:04 +03:00
wake_up ( & block -> b_wait ) ;
res = nlm_granted ;
2005-04-17 02:20:36 +04:00
}
2010-09-22 17:50:35 +04:00
spin_unlock ( & nlm_blocked_lock ) ;
2005-06-22 21:16:31 +04:00
return res ;
2005-04-17 02:20:36 +04:00
}
/*
* The following procedures deal with the recovery of locks after a
* server crash.
*/
/*
* Reclaim all locks on server host. We do this by spawning a separate
* reclaimer thread.
*/
void
2006-10-04 13:15:55 +04:00
nlmclnt_recovery ( struct nlm_host * host )
2005-04-17 02:20:36 +04:00
{
2008-12-23 23:21:33 +03:00
struct task_struct * task ;
2006-06-09 17:40:27 +04:00
if ( ! host -> h_reclaiming ++ ) {
2005-04-17 02:20:36 +04:00
nlm_get_host ( host ) ;
2008-12-23 23:21:33 +03:00
task = kthread_run ( reclaimer , host , "%s-reclaim" , host -> h_name ) ;
if ( IS_ERR ( task ) )
printk ( KERN_ERR "lockd: unable to spawn reclaimer "
"thread. Locks for %s won't be reclaimed! "
"(%ld)\n" , host -> h_name , PTR_ERR ( task ) ) ;
2005-04-17 02:20:36 +04:00
}
}
static int
reclaimer ( void * ptr )
{
struct nlm_host * host = ( struct nlm_host * ) ptr ;
struct nlm_wait * block ;
2006-03-20 21:44:40 +03:00
struct file_lock * fl , * next ;
2006-06-09 17:40:27 +04:00
u32 nsmstate ;
2005-04-17 02:20:36 +04:00
allow_signal ( SIGKILL ) ;
2006-10-04 13:15:55 +04:00
down_write ( & host -> h_rwsem ) ;
2008-10-04 01:15:30 +04:00
lockd_up ( ) ; /* note: this cannot fail as lockd is already running */
2005-04-17 02:20:36 +04:00
2007-01-30 00:19:51 +03:00
dprintk ( "lockd: reclaiming locks for host %s\n" , host -> h_name ) ;
2006-10-04 13:15:55 +04:00
2005-04-17 02:20:36 +04:00
restart :
2006-06-09 17:40:27 +04:00
nsmstate = host -> h_nsmstate ;
2006-10-04 13:15:55 +04:00
/* Force a portmap getport - the peer's lockd will
* most likely end up on a different port.
*/
2006-10-04 13:16:04 +04:00
host -> h_nextrebind = jiffies ;
2006-10-04 13:15:55 +04:00
nlm_rebind_host ( host ) ;
/* First, reclaim all locks that have been granted. */
list_splice_init ( & host -> h_granted , & host -> h_reclaim ) ;
2006-03-20 21:44:40 +03:00
list_for_each_entry_safe ( fl , next , & host -> h_reclaim , fl_u . nfs_fl . list ) {
2006-03-20 21:44:41 +03:00
list_del_init ( & fl -> fl_u . nfs_fl . list ) ;
2005-04-17 02:20:36 +04:00
2008-12-23 23:21:33 +03:00
/*
* Sending this thread a SIGKILL will result in any unreclaimed
* locks being removed from the h_granted list. This means that
* the kernel will not attempt to reclaim them again if a new
* reclaimer thread is spawned for this host.
*/
2005-04-17 02:20:36 +04:00
if ( signalled ( ) )
2006-03-20 21:44:41 +03:00
continue ;
2006-06-09 17:40:27 +04:00
if ( nlmclnt_reclaim ( host , fl ) != 0 )
continue ;
list_add_tail ( & fl -> fl_u . nfs_fl . list , & host -> h_granted ) ;
if ( host -> h_nsmstate != nsmstate ) {
/* Argh! The server rebooted again! */
goto restart ;
}
2005-04-17 02:20:36 +04:00
}
2006-10-04 13:15:55 +04:00
host -> h_reclaiming = 0 ;
up_write ( & host -> h_rwsem ) ;
2007-01-30 00:19:51 +03:00
dprintk ( "NLM: done reclaiming locks for host %s\n" , host -> h_name ) ;
2005-04-17 02:20:36 +04:00
/* Now, wake up all processes that sleep on a blocked lock */
2010-09-22 17:50:35 +04:00
spin_lock ( & nlm_blocked_lock ) ;
2005-06-22 21:16:31 +04:00
list_for_each_entry ( block , & nlm_blocked , b_list ) {
2005-04-17 02:20:36 +04:00
if ( block -> b_host == host ) {
2006-12-13 11:35:03 +03:00
block -> b_status = nlm_lck_denied_grace_period ;
2005-04-17 02:20:36 +04:00
wake_up ( & block -> b_wait ) ;
}
}
2010-09-22 17:50:35 +04:00
spin_unlock ( & nlm_blocked_lock ) ;
2005-04-17 02:20:36 +04:00
/* Release host handle after use */
2010-12-14 18:05:52 +03:00
nlmclnt_release_host ( host ) ;
2005-04-17 02:20:36 +04:00
lockd_down ( ) ;
2008-12-23 23:21:33 +03:00
return 0 ;
2005-04-17 02:20:36 +04:00
}