linux

iv/linux

Author	SHA1	Message	Date
Yongqiang Yang	5129d05fda	ext4: return ENOMEM if find_or_create_pages fails Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-10-31 17:56:10 -04:00
Yongqiang Yang	e260daf279	ext4: move vars to local scope in ext4_discard_partial_page_buffers_no_lock() Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-10-31 17:54:36 -04:00
Tao Ma	0edeb71dc9	ext4: Create helper function for EXT4_IO_END_UNWRITTEN and i_aiodio_unwritten EXT4_IO_END_UNWRITTEN flag set and the increase of i_aiodio_unwritten should be done simultaneously since ext4_end_io_nolock always clear the flag and decrease the counter in the same time. We have found some bugs that the flag is set while leaving i_aiodio_unwritten unchanged(commit 32c80b32c053d). So this patch just tries to create a helper function to wrap them to avoid any future bug. The idea is inspired by Eric. Cc: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-10-31 17:30:44 -04:00
Chuck Lever	e414966b81	NFS: Remove no-op less-than-zero checks on unsigned variables. Introduced by commit 16b374ca "NFSv4.1: pnfs: filelayout: add driver's LAYOUTGET and GETDEVICEINFO infrastructure" (October 20, 2010). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-31 11:52:47 -04:00
Chuck Lever	c6e6966602	NFS: Clean up nfs4_xdr_dec_secinfo() Clean up: Remove superfluous logic at the tail of nfs4_xdr_dec_secinfo() . Introduced by commit 5a5ea0d4 "NFS: Add secinfo procedure" (March 24, 2011). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-31 11:52:47 -04:00
Chuck Lever	c02f557dd0	NFS: Fix documenting comment for nfs_create_request() Clean up: the first parameter of nfs_create_request() has been incorrectly documented since time immemorial (OK, since before 2.6.12). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-31 11:52:47 -04:00
Peng Tao	d743c3c9c2	NFS4: fix cb_recallany decode error craa_type_mask is bitmap4 per RFC5661. We need to expect a length before extracting bitmap value. Cc: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-31 11:51:28 -04:00
Peng Tao	92407e75ce	nfs4: serialize layoutcommit Current pnfs_layoutcommit_inode can not handle parallel layoutcommit. And as Trond suggested , there is no need for client to optimize for parallel layoutcommit. So add NFS_INO_LAYOUTCOMMITTING flag to mark inflight layoutcommit and serialize lalyoutcommit with it. Also mark_inode_dirty_sync if pnfs_layoutcommit_inode fails to issue layoutcommit. Reported-by: Vitaliy Gusev <gusev.vitaliy@nexenta.com> Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-31 11:51:28 -04:00
Theodore Ts'o	b82e384c7b	ext4: optimize locking for end_io extent conversion Now that we are doing the locking correctly, we need to grab the i_completed_io_lock() twice per end_io. We can clean this up by removing the structure from the i_complted_io_list, and use this as the locking mechanism to prevent ext4_flush_completed_IO() racing against ext4_end_io_work(), instead of clearing the EXT4_IO_END_UNWRITTEN in io->flag. In addition, if the ext4_convert_unwritten_extents() returns an error, we no longer keep the end_io structure on the linked list. This doesn't help, because it tends to lock up the file system and wedges the system. That's one way to call attention to the problem, but it doesn't help the overall robustness of the system. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-10-31 10:56:32 -04:00
Theodore Ts'o	4e29802121	ext4: remove unnecessary call to waitqueue_active() The usage of waitqueue_active() is not necessary, and introduces (I believe) a hard-to-hit race. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-10-30 18:41:19 -04:00
Tao Ma	d73d5046a7	ext4: Use correct locking for ext4_end_io_nolock() We must hold i_completed_io_lock when manipulating anything on the i_completed_io_list linked list. This includes io->lock, which we were checking in ext4_end_io_nolock(). So move this check to ext4_end_io_work(). This also has the bonus of avoiding extra work if it is already done without needing to take the mutex. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-10-30 18:26:08 -04:00
Curt Wohlgemuth	0e175a1835	writeback: Add a 'reason' to wb_writeback_work This creates a new 'reason' field in a wb_writeback_work structure, which unambiguously identifies who initiates writeback activity. A 'wb_reason' enumeration has been added to writeback.h, to enumerate the possible reasons. The 'writeback_work_class' and tracepoint event class and 'writeback_queue_io' tracepoints are updated to include the symbolic 'reason' in all trace events. And the 'writeback_inodes_sbXXX' family of routines has had a wb_stats parameter added to them, so callers can specify why writeback is being started. Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Curt Wohlgemuth <curtw@google.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>	2011-10-31 00:33:36 +08:00
Curt Wohlgemuth	ad4e38dd6a	writeback: send work item to queue_io, move_expired_inodes Instead of sending ->older_than_this to queue_io() and move_expired_inodes(), send the entire wb_writeback_work structure. There are other fields of a work item that are useful in these routines and in tracepoints. Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Curt Wohlgemuth <curtw@google.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>	2011-10-31 00:33:27 +08:00
Shirish Pargaonkar	9ef5992e44	cifs: Assume passwords are encoded according to iocharset (try #2 ) Re-posting a patch originally posted by Oskar Liljeblad after rebasing on 3.2. Modify cifs to assume that the supplied password is encoded according to iocharset. Before this patch passwords would be treated as raw 8-bit data, which made authentication with Unicode passwords impossible (at least passwords with characters > 0xFF). The previous code would as a side effect accept passwords encoded with ISO 8859-1, since Unicode < 0x100 basically is ISO 8859-1. Software which relies on that will no longer support password chars > 0x7F unless it also uses iocharset=iso8859-1. (mount.cifs does not care about the encoding so it will work as expected.) Signed-off-by: Oskar Liljeblad <oskar@osk.mine.nu> Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru> Tested-by: A <nimbus1_03087@yahoo.com> Signed-off-by: Steve French <smfrench@gmail.com>	2011-10-29 22:06:54 -05:00
Pavel Shilovsky	5079276066	CIFS: Fix the VFS brlock cache usage in posix locking case Request to the cache in FL_POSIX case only. Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <smfrench@gmail.com>	2011-10-29 22:03:14 -05:00
Eric Sandeen	6d6a435190	ext4: fix race in xattr block allocation path Ceph users reported that when using Ceph on ext4, the filesystem would often become corrupted, containing inodes with incorrect i_blocks counters. I managed to reproduce this with a very hacked-up "streamtest" binary from the Ceph tree. Ceph is doing a lot of xattr writes, to out-of-inode blocks. There is also another thread which does sync_file_range and close, of the same files. The problem appears to happen due to this race: sync/flush thread xattr-set thread ----------------- ---------------- do_writepages ext4_xattr_set ext4_da_writepages ext4_xattr_set_handle mpage_da_map_blocks ext4_xattr_block_set set DELALLOC_RESERVE ext4_new_meta_blocks ext4_mb_new_blocks if (!i_delalloc_reserved_flag) vfs_dq_alloc_block ext4_get_blocks down_write(i_data_sem) set i_delalloc_reserved_flag ... up_write(i_data_sem) if (i_delalloc_reserved_flag) vfs_dq_alloc_block_nofail In other words, the sync/flush thread pops in and sets i_delalloc_reserved_flag on the inode, which makes the xattr thread think that it's in a delalloc path in ext4_new_meta_blocks(), and add the block for a second time, after already having added it once in the !i_delalloc_reserved_flag case in ext4_mb_new_blocks The real problem is that we shouldn't be using the DELALLOC_RESERVED state flag, and instead we should be passing EXT4_GET_BLOCKS_DELALLOC_RESERVE down to ext4_map_blocks() instead of using an inode state flag. We'll fix this for now with using i_data_sem to prevent this race, but this is really not the right way to fix things. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org	2011-10-29 10:15:35 -04:00
Yongqiang Yang	e7b319e397	ext4: trace punch_hole correctly in ext4_ext_map_blocks When ext4_ext_map_blocks() is called by punch_hole, trace should trace blocks punched out. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-10-29 09:39:51 -04:00
Yongqiang Yang	02dc62fba8	ext4: clean up AGGRESSIVE_TEST code Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-10-29 09:29:11 -04:00
Yongqiang Yang	81fdbb4a8d	ext4: move variables to their scope Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-10-29 09:23:38 -04:00
Dmitry Monakhov	5cb81dabcc	ext4: fix quota accounting during migration The tmp_inode should have same uid/gid as the original inode. Otherwise new metadata blocks will be accounted to wrong quota-id, which will result in a quota leak after the inode migration is completed. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-10-29 09:05:00 -04:00
Dmitry Monakhov	fba90ffee8	ext4: migrate cleanup This patch cleanup code a bit, actual logic not changed - Move current block pointer to migrate_structure, let's all walk info will be in one structure. - Get rid of usless null ind-block ptr checks, caller already does that check. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-10-29 09:03:00 -04:00
Linus Torvalds	97d2eb13a0	Merge branch 'for-linus' of git://ceph.newdream.net/git/ceph-client * 'for-linus' of git://ceph.newdream.net/git/ceph-client: libceph: fix double-free of page vector ceph: fix 32-bit ino numbers libceph: force resend of osd requests if we skip an osdmap ceph: use kernel DNS resolver ceph: fix ceph_monc_init memory leak ceph: let the set_layout ioctl set single traits Revert "ceph: don't truncate dirty pages in invalidate work thread" ceph: replace leading spaces with tabs libceph: warn on msg allocation failures libceph: don't complain on msgpool alloc failures libceph: always preallocate mon connection libceph: create messenger with client ceph: document ioctls ceph: implement (optional) max read size ceph: rename rsize -> rasize ceph: make readpages fully async	2011-10-28 16:42:18 -07:00
Steve French	8ea00c6977	[CIFS] Update cifs version to 1.76 Update cifs version to 1.76 now that async read, lock caching, and changes to oplock enabled interface are in. Thanks to Pavel for reminding me. Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <smfrench@gmail.com>	2011-10-28 14:49:46 -05:00
Pavel Shilovsky	d12799b4c3	CIFS: Remove extra mutex_unlock in cifs_lock_add_if to prevent the mutex being unlocked twice if we interrupt a blocked lock. Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <smfrench@gmail.com>	2011-10-28 14:09:23 -05:00
Linus Torvalds	f362f98e7c	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue: (21 commits) leases: fix write-open/read-lease race nfs: drop unnecessary locking in llseek ext4: replace cut'n'pasted llseek code with generic_file_llseek_size vfs: add generic_file_llseek_size vfs: do (nearly) lockless generic_file_llseek direct-io: merge direct_io_walker into __blockdev_direct_IO direct-io: inline the complete submission path direct-io: separate map_bh from dio direct-io: use a slab cache for struct dio direct-io: rearrange fields in dio/dio_submit to avoid holes direct-io: fix a wrong comment direct-io: separate fields only used in the submission path from struct dio vfs: fix spinning prevention in prune_icache_sb vfs: add a comment to inode_permission() vfs: pass all mask flags check_acl and posix_acl_permission vfs: add hex format for MAY_* flag values vfs: indicate that the permission functions take all the MAY_* flags compat: sync compat_stats with statfs. vfs: add "device" tag to /proc/self/mountstats cleanup: vfs: small comment fix for block_invalidatepage ... Fix up trivial conflict in fs/gfs2/file.c (llseek changes)	2011-10-28 10:49:34 -07:00
Linus Torvalds	f793f29611	Merge http://sucs.org/~rohan/git/gfs2-3.0-nmw * http://sucs.org/~rohan/git/gfs2-3.0-nmw: (24 commits) GFS2: Move readahead of metadata during deallocation into its own function GFS2: Remove two unused variables GFS2: Misc fixes GFS2: rewrite fallocate code to write blocks directly GFS2: speed up delete/unlink performance for large files GFS2: Fix off-by-one in gfs2_blk2rgrpd GFS2: Clean up ->page_mkwrite GFS2: Correctly set goal block after allocation GFS2: Fix AIL flush issue during fsync GFS2: Use cached rgrp in gfs2_rlist_add() GFS2: Call do_strip() directly from recursive_scan() GFS2: Remove obsolete assert GFS2: Cache the most recently used resource group in the inode GFS2: Make resource groups "append only" during life of fs GFS2: Use rbtree for resource groups and clean up bitmap buffer ref count scheme GFS2: Fix lseek after SEEK_DATA, SEEK_HOLE have been added GFS2: Clean up gfs2_create GFS2: Use ->dirty_inode() GFS2: Fix bug trap and journaled data fsync GFS2: Fix inode allocation error path ...	2011-10-28 10:44:50 -07:00
Linus Torvalds	dabcbb1bae	Merge branch '3.2-without-smb2' of git://git.samba.org/sfrench/cifs-2.6 * '3.2-without-smb2' of git://git.samba.org/sfrench/cifs-2.6: (52 commits) Fix build break when freezer not configured Add definition for share encryption CIFS: Make cifs_push_locks send as many locks at once as possible CIFS: Send as many mandatory unlock ranges at once as possible CIFS: Implement caching mechanism for posix brlocks CIFS: Implement caching mechanism for mandatory brlocks CIFS: Fix DFS handling in cifs_get_file_info CIFS: Fix error handling in cifs_readv_complete [CIFS] Fixup trivial checkpatch warning [CIFS] Show nostrictsync and noperm mount options in /proc/mounts cifs, freezer: add wait_event_freezekillable and have cifs use it cifs: allow cifs_max_pending to be readable under /sys/module/cifs/parameters cifs: tune bdi.ra_pages in accordance with the rsize cifs: allow for larger rsize= options and change defaults cifs: convert cifs_readpages to use async reads cifs: add cifs_async_readv cifs: fix protocol definition for READ_RSP cifs: add a callback function to receive the rest of the frame cifs: break out 3rd receive phase into separate function cifs: find mid earlier in receive codepath ...	2011-10-28 10:43:32 -07:00
Linus Torvalds	5619a69396	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs * 'for-linus' of git://oss.sgi.com/xfs/xfs: (69 commits) xfs: add AIL pushing tracepoints xfs: put in missed fix for merge problem xfs: do not flush data workqueues in xfs_flush_buftarg xfs: remove XFS_bflush xfs: remove xfs_buf_target_name xfs: use xfs_ioerror_alert in xfs_buf_iodone_callbacks xfs: clean up xfs_ioerror_alert xfs: clean up buffer allocation xfs: remove buffers from the delwri list in xfs_buf_stale xfs: remove XFS_BUF_STALE and XFS_BUF_SUPER_STALE xfs: remove XFS_BUF_SET_VTYPE and XFS_BUF_SET_VTYPE_REF xfs: remove XFS_BUF_FINISH_IOWAIT xfs: remove xfs_get_buftarg_list xfs: fix buffer flushing during unmount xfs: optimize fsync on directories xfs: reduce the number of log forces from tail pushing xfs: Don't allocate new buffers on every call to _xfs_buf_find xfs: simplify xfs_trans_ijoin* again xfs: unlock the inode before log force in xfs_change_file_space xfs: unlock the inode before log force in xfs_fs_nfs_commit_metadata ...	2011-10-28 10:31:42 -07:00
J. Bruce Fields	f3c7691e8d	leases: fix write-open/read-lease race In setlease, we use i_writecount to decide whether we can give out a read lease. In open, we break leases before incrementing i_writecount. There is therefore a window between the break lease and the i_writecount increment when setlease could add a new read lease. This would leave us with a simultaneous write open and read lease, which shouldn't happen. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 14:59:00 +02:00
Andi Kleen	79835a710d	nfs: drop unnecessary locking in llseek This makes NFS follow the standard generic_file_llseek locking scheme. Cc: Trond.Myklebust@netapp.com Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 14:59:00 +02:00
Andi Kleen	4cce0e28b9	ext4: replace cut'n'pasted llseek code with generic_file_llseek_size This gives ext4 the benefits of unlocked llseek. Cc: tytso@mit.edu Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 14:58:59 +02:00
Andi Kleen	5760495a87	vfs: add generic_file_llseek_size Add a generic_file_llseek variant to the VFS that allows passing in the maximum file size of the file system, instead of always using maxbytes from the superblock. This can be used to eliminate some cut'n'paste seek code in ext4. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 14:58:59 +02:00
Andi Kleen	ef3d0fd27e	vfs: do (nearly) lockless generic_file_llseek The i_mutex lock use of generic _file_llseek hurts. Independent processes accessing the same file synchronize over a single lock, even though they have no need for synchronization at all. Under high utilization this can cause llseek to scale very poorly on larger systems. This patch does some rethinking of the llseek locking model: First the 64bit f_pos is not necessarily atomic without locks on 32bit systems. This can already cause races with read() today. This was discussed on linux-kernel in the past and deemed acceptable. The patch does not change that. Let's look at the different seek variants: SEEK_SET: Doesn't really need any locking. If there's a race one writer wins, the other loses. For 32bit the non atomic update races against read() stay the same. Without a lock they can also happen against write() now. The read() race was deemed acceptable in past discussions, and I think if it's ok for read it's ok for write too. => Don't need a lock. SEEK_END: This behaves like SEEK_SET plus it reads the maximum size too. Reading the maximum size would have the 32bit atomic problem. But luckily we already have a way to read the maximum size without locking (i_size_read), so we can just use that instead. Without i_mutex there is no synchronization with write() anymore, however since the write() update is atomic on 64bit it just behaves like another racy SEEK_SET. On non atomic 32bit it's the same as SEEK_SET. => Don't need a lock, but need to use i_size_read() SEEK_CUR: This has a read-modify-write race window on the same file. One could argue that any application doing unsynchronized seeks on the same file is already broken. But for the sake of not adding a regression here I'm using the file->f_lock to synchronize this. Using this lock is much better than the inode mutex because it doesn't synchronize between processes. => So still need a lock, but can use a f_lock. This patch implements this new scheme in generic_file_llseek. I dropped generic_file_llseek_unlocked and changed all callers. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 14:58:58 +02:00
Andi Kleen	847cc6371b	direct-io: merge direct_io_walker into __blockdev_direct_IO This doesn't change anything for the compiler, but hch thought it would make the code clearer. I moved the reference counting into its own little inline. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 14:58:58 +02:00
Andi Kleen	ba253fbf6d	direct-io: inline the complete submission path Add inlines to all the submission path functions. While this increases code size it also gives gcc a lot of optimization opportunities in this critical hotpath. In particular -- together with some other changes -- this allows gcc to get rid of the unnecessary clearing of sdio at the beginning and optimize the messy parameter passing. Any non inlining of a function which takes a sdio parameter would break this optimization because they cannot be done if the address of a structure is taken. Note that benefits are only seen with CONFIG_OPTIMIZE_INLINING and CONFIG_CC_OPTIMIZE_FOR_SIZE both set to off. This gives about 2.2% improvement on a large database benchmark with a high IOPS rate. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 14:58:58 +02:00
Andi Kleen	18772641db	direct-io: separate map_bh from dio Only a single b_private field in the map_bh buffer head is needed after the submission path. Move map_bh separately to avoid storing this information in the long term slab. This avoids the weird 104 byte hole in struct dio_submit which also needed to be memseted early. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 14:58:57 +02:00
Andi Kleen	6e8267f532	direct-io: use a slab cache for struct dio A direct slab call is slightly faster than kmalloc and can be better cached per CPU. It also avoids rounding to the next kmalloc slab. In addition this enforces cache line alignment for struct dio to avoid any false sharing. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 14:58:57 +02:00
Andi Kleen	0dc2bc49be	direct-io: rearrange fields in dio/dio_submit to avoid holes Fix most problems reported by pahole. There is still a weird 104 byte hole after map_bh. I'm not sure what causes this. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 14:58:56 +02:00
Andi Kleen	cde1ecb324	direct-io: fix a wrong comment There's nothing on the stack, even before my changes. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 14:58:56 +02:00
Andi Kleen	eb28be2b4c	direct-io: separate fields only used in the submission path from struct dio This large, but largely mechanic, patch moves all fields in struct dio that are only used in the submission path into a separate on stack data structure. This has the advantage that the memory is very likely cache hot, which is not guaranteed for memory fresh out of kmalloc. This also gives gcc more optimization potential because it can easier determine that there are no external aliases for these variables. The sdio initialization is a initialization now instead of memset. This allows gcc to break sdio into individual fields and optimize away unnecessary zeroing (after all the functions are inlined) Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 14:58:56 +02:00
Christoph Hellwig	62a3ddef61	vfs: fix spinning prevention in prune_icache_sb We need to move the inode to the end of the list to actually make the spinning prevention explained in the comment above it work. With a plain list_move it will simply stay in place as we're always reclaiming from the head of the list. Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 14:58:55 +02:00
Andreas Gruenbacher	948409c74d	vfs: add a comment to inode_permission() Acked-by: J. Bruce Fields <bfields@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andreas Gruenbacher <agruen@kernel.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 14:58:55 +02:00
Andreas Gruenbacher	d124b60a83	vfs: pass all mask flags check_acl and posix_acl_permission Acked-by: J. Bruce Fields <bfields@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andreas Gruenbacher <agruen@kernel.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 14:58:54 +02:00
Andreas Gruenbacher	8fd90c8d1d	vfs: indicate that the permission functions take all the MAY_* flags Acked-by: J. Bruce Fields <bfields@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andreas Gruenbacher <agruen@kernel.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 14:58:54 +02:00
Eric W. Biederman	1448c721e4	compat: sync compat_stats with statfs. This was found by inspection while tracking a similar bug in compat_statfs64, that has been fixed in mainline since decemeber. - This fixes a bug where not all of the f_spare fields were cleared on mips and s390. - Add the f_flags field to struct compat_statfs - Copy f_flags to userspace in case someone cares. - Use __clear_user to copy the f_spare field to userspace to ensure that all of the elements of f_spare are cleared. On some architectures f_spare is has 5 ints and on some architectures f_spare only has 4 ints. Which makes the previous technique of clearing each int individually broken. I don't expect anyone actually uses the old statfs system call anymore but if they do let them benefit from having the compat and the native version working the same. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 14:58:53 +02:00
Bryan Schumaker	a877ee03ac	vfs: add "device" tag to /proc/self/mountstats nfsiostat was failing to find mounted filesystems on kernels after 2.6.38 because of changes to show_vfsstat() by commit c7f404b40a3665d9f4e9a927cc5c1ee0479ed8f9. This patch adds back the "device" tag before the nfs server entry so scripts can parse the mountstats file correctly. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> CC: stable@kernel.org [>=2.6.39] Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 13:55:08 +02:00
Wang Sheng-Hui	814e1d25a5	cleanup: vfs: small comment fix for block_invalidatepage The patch is aganist 3.1-rc3. Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-10-28 13:55:08 +02:00
Steve French	96814ecb40	Add definition for share encryption Samba supports a setfs info level to negotiate encrypted shares. This patch adds the defines so we recognize this info level. Later patches will add the enablement for it. Acked-by: Jeremy Allison <jra@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>	2011-10-27 16:53:31 -05:00
Eric Gouriou	80e675f906	ext4: optimize memmmove lengths in extent/index insertions ext4_ext_insert_extent() (respectively ext4_ext_insert_index()) was using EXT_MAX_EXTENT() (resp. EXT_MAX_INDEX()) to determine how many entries needed to be moved beyond the insertion point. In practice this means that (320 - I) * 24 bytes were memmove()'d when I is the insertion point, rather than (#entries - I) * 24 bytes. This patch uses EXT_LAST_EXTENT() (resp. EXT_LAST_INDEX()) instead to only move existing entries. The code flow is also simplified slightly to highlight similarities and reduce code duplication in the insertion logic. This patch reduces system CPU consumption by over 25% on a 4kB synchronous append DIO write workload when used with the pre-2.6.39 x86_64 memmove() implementation. With the much faster 2.6.39 memmove() implementation we still see a decrease in system CPU usage between 2% and 7%. Note that the ext_debug() output changes with this patch, splitting some log information between entries. Users of the ext_debug() output should note that the "move %d" units changed from reporting the number of bytes moved to reporting the number of entries moved. Signed-off-by: Eric Gouriou <egouriou@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-10-27 11:52:18 -04:00
Eric Gouriou	6f91bc5fda	ext4: optimize ext4_ext_convert_to_initialized() This patch introduces a fast path in ext4_ext_convert_to_initialized() for the case when the conversion can be performed by transferring the newly initialized blocks from the uninitialized extent into an adjacent initialized extent. Doing so removes the expensive invocations of memmove() which occur during extent insertion and the subsequent merge. In practice this should be the common case for clients performing append writes into files pre-allocated via fallocate(FALLOC_FL_KEEP_SIZE). In such a workload performed via direct IO and when using a suboptimal implementation of memmove() (x86_64 prior to the 2.6.39 rewrite), this patch reduces kernel CPU consumption by 32%. Two new trace points are added to ext4_ext_convert_to_initialized() to offer visibility into its operations. No exit trace point has been added due to the multiplicity of return points. This can be revisited once the upstream cleanup is backported. Signed-off-by: Eric Gouriou <egouriou@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-10-27 11:43:23 -04:00

... 2 3 4 5 6 ...

24808 Commits