iv/linux

Author	SHA1	Message	Date
Jeff Layton	0486958f57	nfs: move nfs_file_operations declaration to bottom of file.c (try #2) ...and remove a set of forward declarations. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-11-04 16:39:11 -04:00
Jeff Layton	1788ea6e3b	nfs: when attempting to open a directory, fall back on normal lookup (try #5) commit d953126 changed how nfs_atomic_lookup handles an -EISDIR return from an OPEN call. Prior to that patch, that caused the client to fall back to doing a normal lookup. When that patch went in, the code began returning that error to userspace. The d_revalidate codepath however never had the corresponding change, so it was still possible to end up with a NULL ctx->state pointer after that. That patch caused a regression. When we attempt to open a directory that does not have a cached dentry, that open now errors out with EISDIR. If you attempt the same open with a cached dentry, it will succeed. Fix this by reverting the change in nfs_atomic_lookup and allowing attempts to open directories to fall back to a normal lookup. Also, add an NFSv4-specific f_ops->open routine that just returns -ENOTDIR. This should never be called if things are working properly, but if it ever is, then the dprintk may help in debugging. To facilitate this, a new file_operations field is also added to the nfs_rpc_ops struct. Cc: stable@kernel.org Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-11-04 16:39:04 -04:00
Linus Torvalds	6736c04799	Merge branch 'nfs-for-3.2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs * 'nfs-for-3.2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (25 commits) nfs: set vs_hidden on nfs4_callback_version4 (try #2) pnfs-obj: Support for RAID5 read-4-write interface. pnfs-obj: move to ore 03: Remove old raid engine pnfs-obj: move to ore 02: move to ORE pnfs-obj: move to ore 01: ore_layout & ore_components pnfs-obj: Rename objlayout_io_state => objlayout_io_res pnfs-obj: Get rid of objlayout_{alloc,free}_io_state pnfs-obj: Return PNFS_NOT_ATTEMPTED in case of read/write_pagelist pnfs-obj: Remove redundant EOF from objlayout_io_state nfs: Remove unused variable from write.c nfs: Fix unused variable warning from file.c NFS: Remove no-op less-than-zero checks on unsigned variables. NFS: Clean up nfs4_xdr_dec_secinfo() NFS: Fix documenting comment for nfs_create_request() NFS4: fix cb_recallany decode error nfs4: serialize layoutcommit SUNRPC: remove rpcbind clients destruction on module cleanup SUNRPC: remove rpcbind clients creation during service registering NFSd: call svc rpcbind cleanup explicitly SUNRPC: cleanup service destruction ...	2011-11-04 12:27:43 -07:00
Jeff Layton	6070295efc	nfs: set vs_hidden on nfs4_callback_version4 (try #2) This service should not be registered with or unregistered from rpcbind. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-11-04 09:00:09 -04:00
Trond Myklebust	31cbecb4ab	Merge branch 'osd-devel' into nfs-for-next	2011-11-02 23:56:40 -04:00
Boaz Harrosh	278c023a99	pnfs-obj: Support for RAID5 read-4-write interface. The ORE needs the filesystem to supply an r4w_get_page/r4w_put_page API so it can get cache pages to read into when writing partial stripes. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-11-02 23:56:09 -04:00
Boaz Harrosh	04291b628c	pnfs-obj: move to ore 03: Remove old raid engine Finally remove all the old raid engine, which is by now dead code. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-11-02 23:56:08 -04:00
Boaz Harrosh	eecfc6312a	pnfs-obj: move to ore 02: move to ORE In this patch we are actually moving to the ORE. (Object Raid Engine). objio_state holds a pointer to an ore_io_state. Once we have an ore_io_state at hand we can call the ore for reading/writing. We register on the done path to kick off the nfs io_done mechanism. Again for Ease of reviewing the old code is "#if 0" but is not removed so the diff command works better. The old code will be removed in the next patch. fs/exofs/Kconfig::ORE is modified to also be auto-included if PNFS_OBJLAYOUT is set. Since we now depend on ORE. (See comments in fs/exofs/Kconfig) Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-11-02 23:56:08 -04:00
Boaz Harrosh	af4f5b54bc	pnfs-obj: move to ore 01: ore_layout & ore_components For ease of reviewing I split the move to ore into 3 parts: (1) move to ore 01: ore_layout & ore_components; (2) move to ore 02: move to ORE; (3) move to ore 03: Remove old raid engine. This patch modifies the objio_lseg, layout-segment level and devices and components arrays to use the ORE types. Though it will be removed soon, the raid engine is also modified to actually compile, and possibly run, with the new types. So it is the same old raid engine but with some new ORE types. For ease of reviewing, some of the old code is "#if 0" but is not removed so the diff command works better. The old code will be removed in the 3rd patch. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-11-02 23:56:07 -04:00
Boaz Harrosh	e2e04355d9	pnfs-obj: Rename objlayout_io_state => objlayout_io_res * All instances of objlayout_io_state => objlayout_io_res * All instances of state => oir; * All instances of ol_state => oir; Big but nothing to it Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-11-02 23:56:06 -04:00
Boaz Harrosh	96218556b0	pnfs-obj: Get rid of objlayout_{alloc,free}_io_state This is part of moving objio_osd to use the ORE. objlayout_io_state had two functions: 1. It was used in the error reporting mechanism at layout_return. This function is kept intact. (Later patch will rename objlayout_io_state => objlayout_io_res) 2. Carrier of rw io members into the objio_read/write_pagelist API. This is removed in this patch. The {r,w}data received from NFS are passed directly to the objio_{read,write}_pagelist API. The io_engine is now allocating its own IO state as part of the read/write. The minimal functionality that was part of the generic allocation is passed to the io_engine. So part of this patch is rename of: ios->ol_state.foo => ios->foo At objlayout_{read,write}_done an objlayout_io_state is passed that denotes the result of the IO. (Hence the later name change). If the IO is successful objlayout calls an objio_free_result() API immediately (which for objio_osd causes the release of the io_state). If the IO ended in an error it is held onto until reported in layout_return and is released later through the objio_free_result() API. (All this is not new, just renamed and cleaned) Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-11-02 23:56:05 -04:00
Boaz Harrosh	e6c40fe3f4	pnfs-obj: Return PNFS_NOT_ATTEMPTED in case of read/write_pagelist The objlayout driver was always returning PNFS_ATTEMPTED from its read/write_pagelist operations, even on error. Fix that. Start by establishing an error return API from io-engine, by not returning ssize_t (length-or-error) but returning "int": 0 = OK, < 0 = error. And clean up all return types in io-engine. Then if io-engine returned an error, return PNFS_NOT_ATTEMPTED to the generic layer. (With a dprint) Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-11-02 23:56:03 -04:00
Boaz Harrosh	4cdc685c7d	pnfs-obj: Remove redundant EOF from objlayout_io_state The EOF calculation was done on .read_pagelist(), cached in objlayout_io_state->eof, and set in objlayout_read_done() into nfs_read_data->res.eof. So set it directly into nfs_read_data->res.eof and avoid the extra member. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-11-02 23:56:00 -04:00
Rakib Mullick	2b72c9ccd2	nfs: Remove unused variable from write.c When CONFIG_NFS=y and CONFIG_NFS_V3_{,V4}=n we get the following warning. fs/nfs/write.c: In function ‘nfs_writeback_done’: fs/nfs/write.c:1246:21: warning: unused variable ‘server’ Remove the variable 'server' to fix the above warning. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-11-02 23:50:24 -04:00
Rakib Mullick	6f276e49fd	nfs: Fix unused variable warning from file.c Fix the following unused variable warning. fs/nfs/file.c: In function ‘nfs_file_release’: fs/nfs/file.c:140:17: warning: unused variable ‘dentry’ fs/nfs/file.c: In function ‘nfs_file_read’: fs/nfs/file.c:237:9: warning: unused variable ‘count’ Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-11-02 23:49:09 -04:00
Miklos Szeredi	bfe8684869	filesystems: add set_nlink() Replace remaining direct i_nlink updates with a new set_nlink() updater function. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Tested-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-11-02 12:53:43 +01:00
Miklos Szeredi	6d6b77f163	filesystems: add missing nlink wrappers Replace direct i_nlink updates with the respective updater function (inc_nlink, drop_nlink, clear_nlink, inode_dec_link_count). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2011-11-02 12:53:43 +01:00
Chuck Lever	e414966b81	NFS: Remove no-op less-than-zero checks on unsigned variables. Introduced by commit 16b374ca "NFSv4.1: pnfs: filelayout: add driver's LAYOUTGET and GETDEVICEINFO infrastructure" (October 20, 2010). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-31 11:52:47 -04:00
Chuck Lever	c6e6966602	NFS: Clean up nfs4_xdr_dec_secinfo() Clean up: Remove superfluous logic at the tail of nfs4_xdr_dec_secinfo() . Introduced by commit 5a5ea0d4 "NFS: Add secinfo procedure" (March 24, 2011). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-31 11:52:47 -04:00
Chuck Lever	c02f557dd0	NFS: Fix documenting comment for nfs_create_request() Clean up: the first parameter of nfs_create_request() has been incorrectly documented since time immemorial (OK, since before 2.6.12). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-31 11:52:47 -04:00
Peng Tao	d743c3c9c2	NFS4: fix cb_recallany decode error craa_type_mask is bitmap4 per RFC5661. We need to expect a length before extracting bitmap value. Cc: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-31 11:51:28 -04:00
Peng Tao	92407e75ce	nfs4: serialize layoutcommit Current pnfs_layoutcommit_inode cannot handle parallel layoutcommit. And as Trond suggested, there is no need for the client to optimize for parallel layoutcommit. So add an NFS_INO_LAYOUTCOMMITTING flag to mark inflight layoutcommit and serialize layoutcommit with it. Also mark_inode_dirty_sync if pnfs_layoutcommit_inode fails to issue layoutcommit. Reported-by: Vitaliy Gusev <gusev.vitaliy@nexenta.com> Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-31 11:51:28 -04:00
Linus Torvalds	f362f98e7c	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue: (21 commits) leases: fix write-open/read-lease race nfs: drop unnecessary locking in llseek ext4: replace cut'n'pasted llseek code with generic_file_llseek_size vfs: add generic_file_llseek_size vfs: do (nearly) lockless generic_file_llseek direct-io: merge direct_io_walker into __blockdev_direct_IO direct-io: inline the complete submission path direct-io: separate map_bh from dio direct-io: use a slab cache for struct dio direct-io: rearrange fields in dio/dio_submit to avoid holes direct-io: fix a wrong comment direct-io: separate fields only used in the submission path from struct dio vfs: fix spinning prevention in prune_icache_sb vfs: add a comment to inode_permission() vfs: pass all mask flags check_acl and posix_acl_permission vfs: add hex format for MAY_* flag values vfs: indicate that the permission functions take all the MAY_* flags compat: sync compat_stats with statfs. vfs: add "device" tag to /proc/self/mountstats cleanup: vfs: small comment fix for block_invalidatepage ... Fix up trivial conflict in fs/gfs2/file.c (llseek changes)	2011-10-28 10:49:34 -07:00
Andi Kleen	79835a710d	nfs: drop unnecessary locking in llseek This makes NFS follow the standard generic_file_llseek locking scheme. Cc: Trond.Myklebust@netapp.com Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 14:59:00 +02:00
Andi Kleen	ef3d0fd27e	vfs: do (nearly) lockless generic_file_llseek The i_mutex lock use of generic_file_llseek hurts. Independent processes accessing the same file synchronize over a single lock, even though they have no need for synchronization at all. Under high utilization this can cause llseek to scale very poorly on larger systems. This patch does some rethinking of the llseek locking model: First the 64bit f_pos is not necessarily atomic without locks on 32bit systems. This can already cause races with read() today. This was discussed on linux-kernel in the past and deemed acceptable. The patch does not change that. Let's look at the different seek variants: SEEK_SET: Doesn't really need any locking. If there's a race one writer wins, the other loses. For 32bit the non atomic update races against read() stay the same. Without a lock they can also happen against write() now. The read() race was deemed acceptable in past discussions, and I think if it's ok for read it's ok for write too. => Don't need a lock. SEEK_END: This behaves like SEEK_SET plus it reads the maximum size too. Reading the maximum size would have the 32bit atomic problem. But luckily we already have a way to read the maximum size without locking (i_size_read), so we can just use that instead. Without i_mutex there is no synchronization with write() anymore, however since the write() update is atomic on 64bit it just behaves like another racy SEEK_SET. On non atomic 32bit it's the same as SEEK_SET. => Don't need a lock, but need to use i_size_read(). SEEK_CUR: This has a read-modify-write race window on the same file. One could argue that any application doing unsynchronized seeks on the same file is already broken. But for the sake of not adding a regression here I'm using the file->f_lock to synchronize this. Using this lock is much better than the inode mutex because it doesn't synchronize between processes. => So still need a lock, but can use a f_lock. This patch implements this new scheme in generic_file_llseek. I dropped generic_file_llseek_unlocked and changed all callers. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 14:58:58 +02:00
Linus Torvalds	ef78cc75f1	Merge branch 'nfs-for-3.2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs * 'nfs-for-3.2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (26 commits) Check validity of cl_rpcclient in nfs_server_list_show NFS: Get rid of the nfs_rdata_mempool NFS: Don't rely on PageError in nfs_readpage_release_partial NFS: Get rid of unnecessary calls to ClearPageError() in read code NFS: Get rid of nfs_restart_rpc() NFS: Get rid of the unused nfs_write_data->flags field NFS: Get rid of the unused nfs_read_data->flags field NFSv4: Translate NFS4ERR_BADNAME into ENOENT when applied to a lookup NFS: Remove the unused "lookupfh()" version of nfs4_proc_lookup() NFS: Use the inode->i_version to cache NFSv4 change attribute information SUNRPC: Remove unnecessary export of rpc_sockaddr2uaddr SUNRPC: Fix rpc_sockaddr2uaddr nfs/super.c: local functions should be static pnfsblock: fix writeback deadlock pnfsblock: fix NULL pointer dereference pnfs: recoalesce when ld read pagelist fails pnfs: recoalesce when ld write pagelist fails pnfs: make _set_lo_fail generic pnfsblock: add missing rpc_put_mount and path_put SUNRPC/NFS: make rpc pipe upcall generic ...	2011-10-25 15:44:06 +02:00
Linus Torvalds	1442d1678c	Merge branch 'for-3.2' of git://linux-nfs.org/~bfields/linux * 'for-3.2' of git://linux-nfs.org/~bfields/linux: (103 commits) nfs41: implement DESTROY_CLIENTID operation nfsd4: typo logical vs bitwise negate for want_mask nfsd4: allow NFS4_SHARE_SIGNAL_DELEG_WHEN_RESRC_AVAIL | NFS4_SHARE_PUSH_DELEG_WHEN_UNCONTENDED nfsd4: seq->status_flags may be used unitialized nfsd41: use SEQ4_STATUS_BACKCHANNEL_FAULT when cb_sequence is invalid nfsd4: implement new 4.1 open reclaim types nfsd4: remove unneeded CLAIM_DELEGATE_CUR workaround nfsd4: warn on open failure after create nfsd4: preallocate open stateid in process_open1() nfsd4: do idr preallocation with stateid allocation nfsd4: preallocate nfs4_file in process_open1() nfsd4: clean up open owners on OPEN failure nfsd4: simplify process_open1 logic nfsd4: make is_open_owner boolean nfsd4: centralize renew_client() calls nfsd4: typo logical vs bitwise negate nfs: fix bug about IPv6 address scope checking nfsd4: more robust ignoring of WANT bits in OPEN nfsd4: move name-length checks to xdr nfsd4: move access/deny validity checks to xdr code ...	2011-10-25 15:42:01 +02:00
Malahal Naineni	940aab4902	Check validity of cl_rpcclient in nfs_server_list_show As soon as the nfs_client gets created, its cl_rpcclient is set to ERR_PTR(-EINVAL). The rpc client structure is allocated later. Check if the client is ready before using the cl_rpcclient pointer. Signed-off-by: Malahal Naineni <malahal@us.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-20 18:44:04 -05:00
Trond Myklebust	b6ee8cd264	NFS: Get rid of the nfs_rdata_mempool We don't need a mempool in order to guarantee reliable NFS read performance. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-19 13:58:38 -07:00
Trond Myklebust	fba730050d	NFS: Don't rely on PageError in nfs_readpage_release_partial Don't rely on the PageError flag to tell us if one of the partial reads of the page failed. Instead, replace that with a dedicated flag in the struct nfs_page. Then clean out redundant uses of the PageError flag: the VM no longer checks it for reads. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-19 13:58:38 -07:00
Trond Myklebust	fbb5a9abf0	NFS: Get rid of unnecessary calls to ClearPageError() in read code The generic file read code does that for us anyway. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-19 13:58:37 -07:00
Trond Myklebust	d00c5d4386	NFS: Get rid of nfs_restart_rpc() It can trivially be replaced with rpc_restart_call_prepare. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-19 13:58:30 -07:00
Trond Myklebust	08ef7bd3bc	NFSv4: Translate NFS4ERR_BADNAME into ENOENT when applied to a lookup Both LOOKUP and OPEN operations may return NFS4ERR_BADNAME if we send an invalid name as a filename argument. As far as the application is concerned, it just has to know that the file doesn't exist, and so ENOENT would be the appropriate reply. We should only return EINVAL if the filename is being used to _create_ a new object on the remote filesystem. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 16:13:51 -07:00
Trond Myklebust	0c2e53f11a	NFS: Remove the unused "lookupfh()" version of nfs4_proc_lookup() ...and also remove the associated nfs_v4_clientops entry. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 16:13:51 -07:00
Trond Myklebust	a9a4a87a59	NFS: Use the inode->i_version to cache NFSv4 change attribute information Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 09:14:34 -07:00
H Hartley Sweeten	45402c38ee	nfs/super.c: local functions should be static commit ae50c0b5 "pnfs: client stats" added additional information to the output of /proc/self/mountstats. The new functions introduced are only used in this file and should be marked static. If CONFIG_NFS_V4_1 is not defined, empty stub functions are used. If CONFIG_NFS_V4 is not defined these stub functions are not used at all. Adding static for the functions results in compile warnings: fs/nfs/super.c:743: warning: 'show_sessions' defined but not used fs/nfs/super.c:756: warning: 'show_pnfs' defined but not used Fix this by adding a #ifdef CONFIG_NFS_V4 guard around the two show_ functions. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 09:08:15 -07:00
Peng Tao	7542274519	pnfsblock: fix writeback deadlock We should check if the sector is already initialized before trying to grab the page from page cache. Otherwise when two pages of the same block are written back by two threads each calling from writepage_locked, it can cause deadlock like below. [ 1080.972099] INFO: task kswapd0:25 blocked for more than 120 seconds. [ 1080.972377] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1080.972812] kswapd0 D ffff88000c4926c0 0 25 2 0x00000000 [ 1080.972816] ffff88000df276b0 0000000000000046 ffff88000df27640 ffffffff81013ba7 [ 1080.972821] ffff88000c492310 ffff88000df27fd8 ffff88000df27fd8 00000000001d3440 [ 1080.972824] ffff88000c378000 ffff88000c492310 ffff8800175d3d40 ffff880017fc75a8 [ 1080.972828] Call Trace: [ 1080.972860] [<ffffffff81013ba7>] ? read_tsc+0x9/0x19 [ 1080.972877] [<ffffffff810e0b23>] ? lock_page+0x2b/0x2b [ 1080.972899] [<ffffffff81475a1d>] io_schedule+0x63/0x7e [ 1080.972902] [<ffffffff810e0b31>] sleep_on_page+0xe/0x12 [ 1080.972905] [<ffffffff81475fe8>] __wait_on_bit_lock+0x46/0x8f [ 1080.972916] [<ffffffff810822d7>] ? lock_release_holdtime.part.7+0x6b/0x72 [ 1080.972919] [<ffffffff810e0af6>] __lock_page+0x66/0x68 [ 1080.972928] [<ffffffff81072705>] ? autoremove_wake_function+0x3d/0x3d [ 1080.972932] [<ffffffff810e0b1f>] lock_page+0x27/0x2b [ 1080.972934] [<ffffffff810e0bcf>] find_lock_page+0x34/0x57 [ 1080.972937] [<ffffffff810e1738>] find_or_create_page+0x34/0x8a [ 1080.972947] [<ffffffffa034245b>] bl_write_pagelist+0x205/0x6da [blocklayoutdriver] [ 1080.972951] [<ffffffffa034145d>] ? bl_free_lseg+0x38/0x38 [blocklayoutdriver] [ 1080.972995] [<ffffffffa02e27b9>] ? nfs_write_rpcsetup+0x118/0x123 [nfs] [ 1080.973033] [<ffffffffa030246b>] pnfs_generic_pg_writepages+0x10b/0x1f4 [nfs] [ 1080.973089] [<ffffffffa02deaae>] nfs_pageio_doio+0x1a/0x43 [nfs] [ 1080.973098] [<ffffffffa02df035>] nfs_pageio_complete+0x16/0x2d [nfs] [ 1080.973108] [<ffffffffa02e2d8f>] nfs_writepage_locked+0xa0/0xbf [nfs] [ 1080.973119] [<ffffffffa02e36a1>] nfs_writepage+0x16/0x2b [nfs] [ 1080.973122] [<ffffffff810e8762>] ? clear_page_dirty_for_io+0x87/0x9a [ 1080.973133] [<ffffffff810efc5b>] shrink_page_list+0x39b/0x6c8 [ 1080.973139] [<ffffffff810f03bb>] shrink_inactive_list+0x22c/0x39e [ 1080.973144] [<ffffffff810822d7>] ? lock_release_holdtime.part.7+0x6b/0x72 [ 1080.973148] [<ffffffff810f0c33>] shrink_zone+0x445/0x588 [ 1080.973152] [<ffffffff810f1a11>] balance_pgdat+0x2c2/0x56b [ 1080.973170] [<ffffffff81254208>] ? __bitmap_weight+0x34/0x80 [ 1080.973175] [<ffffffff810f1f78>] kswapd+0x2be/0x2fa [ 1080.973179] [<ffffffff810726c8>] ? __init_waitqueue_head+0x4b/0x4b [ 1080.973183] [<ffffffff810f1cba>] ? balance_pgdat+0x56b/0x56b [ 1080.973187] [<ffffffff81071f69>] kthread+0xa8/0xb0 [ 1080.973200] [<ffffffff814806b4>] kernel_thread_helper+0x4/0x10 [ 1080.973205] [<ffffffff81071ec1>] ? __init_kthread_worker+0x5a/0x5a [ 1080.973210] [<ffffffff814806b0>] ? gs_change+0x13/0x13 [ 1080.973213] no locks held by kswapd0/25. Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 09:08:15 -07:00
Peng Tao	e6d05a757c	pnfsblock: fix NULL pointer dereference bl_add_page_to_bio returns error pointer. bio should be reset to NULL in failure cases as the out path always calls bl_submit_bio. Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 09:08:14 -07:00
Peng Tao	9b7eecdcfe	pnfs: recoalesce when ld read pagelist fails For pnfs pagelist read failure, we need to pg_recoalesce and resend IO to mds. Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 09:08:14 -07:00
Peng Tao	8ce160c5ef	pnfs: recoalesce when ld write pagelist fails For pnfs pagelist write failure, we need to pg_recoalesce and resend IO to mds. Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 09:08:13 -07:00
Peng Tao	1b0ae06877	pnfs: make _set_lo_fail generic The file layout and block layout drivers both use it to set the layout io failure bit. So make it generic. Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 09:08:13 -07:00
Peng Tao	760383f1ee	pnfsblock: add missing rpc_put_mount and path_put Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 09:08:12 -07:00
Peng Tao	c1225158a8	SUNRPC/NFS: make rpc pipe upcall generic The same function is used by idmap, gss and blocklayout code. Make it generic. Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 09:08:12 -07:00
Jim Rees	fdc17abbc4	pnfsblock: fix size of upcall message Make the status field explicitly 32 bits. "...it's unlikely that the kernel and userspace would differ on the size of an int here, but it might be a good idea to go ahead and make that explicitly 32 bits in case we end up dealing with more exotic arches at some point in the future." Suggested-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 09:08:11 -07:00
Jim Rees	516f2e24fa	pnfsblock: fix return code confusion Always return PTR_ERR, not NULL, from nfs4_blk_get_deviceinfo and nfs4_blk_decode_device. Check for IS_ERR, not NULL, in bl_set_layoutdriver when calling nfs4_blk_get_deviceinfo. Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 09:08:11 -07:00
Jeff Layton	2da9565235	nfs: don't try to migrate pages with active requests nfs_find_and_lock_request will take a reference to the nfs_page and will then put it if the req is already locked. It's possible though that the reference will be the last one. That put then can kick off a whole series of reference puts: nfs_page -> nfs_open_context -> dentry -> inode. If the inode ends up being deleted, then the VFS will call truncate_inode_pages. That function will try to take the page lock, but it was already locked when migrate_page was called. The code deadlocks. Fix this by simply refusing the migration request if PagePrivate is already set, indicating that the page is already associated with an active read or write request. We've had a customer test a backported version of this patch and the preliminary results seem good. Cc: stable@kernel.org Cc: Andrea Arcangeli <aarcange@redhat.com> Reported-by: Harshula Jayasuriya <harshula@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 09:08:11 -07:00
Mi Jinlong	b9dd3abbbc	nfs: fix bug about IPv6 address scope checking The result from ipv6_addr_scope() is not always a single scope, so we can't use equality to compare the result with IPV6_ADDR_SCOPE_LINKLOCAL in nfs_sockaddr_match_ipaddr6. This patch fixes the problem, and checks the address before scope_id. Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 09:08:10 -07:00
Jeff Layton	3236c3e1ad	nfs: don't redirty inode when ncommit == 0 in nfs_commit_unstable_pages commit 420e3646 allowed the kernel to reduce the number of unnecessary commit calls by skipping the commit when there are a large number of outstanding pages. However, the current test in nfs_commit_unstable_pages does not handle the edge condition properly. When ncommit == 0, then that means that the kernel doesn't need to do anything more for the inode. The current test though in the WB_SYNC_NONE case will return true, and the inode will end up being marked dirty. Once that happens the inode will never be clean until there's a WB_SYNC_ALL flush. Fix this by immediately returning from nfs_commit_unstable_pages when ncommit == 0. Mike noticed this problem initially in RHEL5 (2.6.18-based kernel) which has a backported version of 420e3646. The inode cache there was growing very large. The inode cache was unable to be shrunk since the inodes were all marked dirty. Calling sync() would essentially "fix" the problem -- the WB_SYNC_ALL flush would result in the inodes all being marked clean. What I'm not clear on is how big a problem this is in mainline kernels as the writeback code there is very different. Either way, it seems incorrect to re-mark the inode dirty in this case. Reported-by: Mike McLean <mikem@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Cc: stable@kernel.org [2.6.34+] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 09:08:10 -07:00
Trond Myklebust	59b7c05fff	Revert "NFS: Ensure that writeback_single_inode() calls write_inode() when syncing" This reverts commit b80c3cb628f0ebc241b02e38dd028969fb8026a2. The reverted commit was rendered obsolete by a VFS fix: commit 5547e8aac6f71505d621a612de2fca0dd988b439 (writeback: Update dirty flags in two steps). We no longer need to worry about writeback_single_inode() missing our marking the inode for COMMIT in the 'do_writepages()' call. Reverting this patch fixes a performance regression in which the inode would continuously get queued to the dirty list, causing the writeback code to unnecessarily try to send a COMMIT. Signed-off-by: Trond Myklebust <Trond.Myklebust> Tested-by: Simon Kirby <sim@hostway.ca> Cc: stable@kernel.org [2.6.35+]	2011-10-18 09:08:09 -07:00
Mi Jinlong	5703728ac1	nfs: fix bug about IPv6 address scope checking The result from ipv6_addr_scope() is a set of flags, not a single value, so we can't just compare the result with IPV6_ADDR_SCOPE_LINKLOCAL. This patch fixes the problem, and checks for unequal addresses before scope_id. Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-12 10:30:29 -04:00
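
Note on commit ef3d0fd27e (vfs: do (nearly) lockless generic_file_llseek): the per-variant locking rules in that message can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the kernel implementation. The names model_inode, model_file, model_set_pos and model_llseek are hypothetical; a pthread mutex stands in for file->f_lock, and a plain field read stands in for i_size_read().

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */

/* Hypothetical stand-ins for struct inode and struct file. */
struct model_inode {
    int64_t size;            /* read below where the kernel would use i_size_read() */
};

struct model_file {
    pthread_mutex_t f_lock;  /* per-open-file lock, models file->f_lock */
    int64_t f_pos;
    struct model_inode *inode;
};

/* Validate and store a new position; returns the position or -EINVAL. */
static int64_t model_set_pos(struct model_file *f, int64_t off, int64_t maxsize)
{
    if (off < 0 || off > maxsize)
        return -EINVAL;
    f->f_pos = off;
    return off;
}

/* Model of the scheme: only SEEK_CUR takes a lock, and only the per-file one. */
int64_t model_llseek(struct model_file *f, int64_t off, int whence, int64_t maxsize)
{
    int64_t ret;

    assert(f != NULL && f->inode != NULL);

    switch (whence) {
    case SEEK_SET:
        /* No lock: if two seeks race, one writer simply wins. */
        return model_set_pos(f, off, maxsize);
    case SEEK_END:
        /* No lock: only the file size is read before the store. */
        return model_set_pos(f, f->inode->size + off, maxsize);
    case SEEK_CUR:
        /* Read-modify-write of f_pos, so serialize on the per-file lock. */
        pthread_mutex_lock(&f->f_lock);
        ret = model_set_pos(f, f->f_pos + off, maxsize);
        pthread_mutex_unlock(&f->f_lock);
        return ret;
    default:
        return -EINVAL;
    }
}
```

The model only shows which paths take the lock. The unlocked stores in the SEEK_SET and SEEK_END paths would be a data race in portable C if called from several threads; the commit message explains why the kernel accepts that race for f_pos. Offset overflow is not handled here.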

2181 Commits