linux

iv/linux

Author	SHA1	Message	Date
Barry Naujok	d3689d7687	[XFS] kmem_free and kmem_realloc to use const void * SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31212a Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>	2008-07-28 16:58:43 +10:00
Barry Naujok	189f4bf22b	[XFS] XFS: ASCII case-insensitive support Implement ASCII case-insensitive support. It's primary purpose is for supporting existing filesystems that already use this case-insensitive mode migrated from IRIX. But, if you only need ASCII-only case-insensitive support (ie. English only) and will never use another language, then this mode is perfectly adequate. ASCII-CI is implemented by generating hashes based on lower-case letters and doing lower-case compares. It implements a new xfs_nameops vector for doing the hashes and comparisons for all filename operations. To create a filesystem with this CI mode, use: # mkfs.xfs -n version=ci <device> SGI-PV: 981516 SGI-Modid: xfs-linux-melb:xfs-kern:31209a Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>	2008-07-28 16:58:42 +10:00
Barry Naujok	384f3ced07	[XFS] Return case-insensitive match for dentry cache This implements the code to store the actual filename found during a lookup in the dentry cache and to avoid multiple entries in the dcache pointing to the same inode. To avoid polluting the dcache, we implement a new directory inode operations for lookup. xfs_vn_ci_lookup() stores the correct case name in the dcache. The "actual name" is only allocated and returned for a case- insensitive match and not an actual match. Another unusual interaction with the dcache is not storing negative dentries like other filesystems doing a d_add(dentry, NULL) when an ENOENT is returned. During the VFS lookup, if a dentry returned has no inode, dput is called and ENOENT is returned. By not doing a d_add, this actually removes it completely from the dcache to be reused. create/rename have to be modified to support unhashed dentries being passed in. SGI-PV: 981521 SGI-Modid: xfs-linux-melb:xfs-kern:31208a Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>	2008-07-28 16:58:40 +10:00
Barry Naujok	9403540c06	dcache: Add case-insensitive support d_ci_add() routine This add a dcache entry to the dcache for lookup, but changing the name that is associated with the entry rather than the one passed in to the lookup routine. First, it sees if the case-exact match already exists in the dcache and uses it if one exists. Otherwise, it allocates a new node with the new name and splices it into the dcache. Original code from ntfs_lookup in fs/ntfs/namei.c by Anton Altaparmakov. Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Anton Altaparmakov <aia21@cantab.net> Acked-by: Christoph Hellwig <hch@infradead.org>	2008-07-28 16:58:39 +10:00
Barry Naujok	6a178100ab	[XFS] Add op_flags field and helpers to xfs_da_args The end of the xfs_da_args structure has 4 unsigned char fields for true/false information on directory and attr operations using the xfs_da_args structure. The following converts these 4 into a op_flags field that uses the first 4 bits for these fields and allows expansion for future operation information (eg. case-insensitive lookup request). SGI-PV: 981520 SGI-Modid: xfs-linux-melb:xfs-kern:31206a Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>	2008-07-28 16:58:37 +10:00
Barry Naujok	5163f95a08	[XFS] Name operation vector for hash and compare Adds two pieces of functionality for the basis of case-insensitive support in XFS: 1. A comparison result enumerated type: xfs_dacmp. It represents an exact match, case-insensitive match or no match at all. This patch only implements different and exact results. 2. xfs_nameops vector for specifying how to perform the hash generation of filenames and comparision methods. In this patch the hash vector points to the existing xfs_da_hashname function and the comparison method does a length compare, and if the same, does a memcmp and return the xfs_dacmp result. All filename functions that use the hash (create, lookup remove, rename, etc) now use the xfs_nameops.hashname function and all directory lookup functions also use the xfs_nameops.compname function. The lookup functions also handle case-insensitive results even though the default comparison function cannot return that. And important aspect of the lookup functions is that an exact match always has precedence over a case-insensitive. So while a case-insensitive match is found, we have to keep looking just in case there is an exact match. In the meantime, the info for the first case-insensitive match is retained if no exact match is found. SGI-PV: 981519 SGI-Modid: xfs-linux-melb:xfs-kern:31205a Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>	2008-07-28 16:58:36 +10:00
Eric Sandeen	68f34d5107	[XFS] de-duplicate calls to xfs_attr_trace_enter Every call to xfs_attr_trace_enter() shares the exact same 16 args in the middle... just send in the context pointer and let the next level down split it into the ktrace. Compile tested only. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:31200a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:35 +10:00
Christoph Hellwig	120226c11a	[XFS] add missing call to xfs_filestream_unmount on xfs_mountfs failure SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31199a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:33 +10:00
Christoph Hellwig	effa2eda3a	[XFS] rename error2 goto label in xfs_fs_fill_super SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31198a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:31 +10:00
Christoph Hellwig	95db4e21b7	[XFS] kill calls to xfs_binval in the mount error path xfs_binval aka xfs_flush_buftarg is the first thing done in xfs_free_buftarg, so there is no need to have duplicated calls just before xfs_free_buftarg in the mount failure path. SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31197a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:30 +10:00
Christoph Hellwig	c962fb7902	[XFS] kill xfs_mount_init xfs_mount_init is inlined into xfs_fs_fill_super and allocation switched to kzalloc. Plug a leak of the mount structure for most early mount failures. Move xfs_icsb_init_counters to as late as possible in the mount path and make sure to undo it so that no stale hotplug cpu notifiers are left around on mount failures. SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31196a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:29 +10:00
Christoph Hellwig	bdd907bab7	[XFS] allow xfs_args_allocate to fail Switch xfs_args_allocate to kzalloc and handle failures. SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31195a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:27 +10:00
Christoph Hellwig	e34b562c6b	[XFS] add xfs_setup_devices helper Split setting the block and sector size out of xfs_fs_fill_super into a small helper to make xfs_fs_fill_super more readable. SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31194a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:26 +10:00
Christoph Hellwig	19f354d4c3	[XFS] sort out opening and closing of the block devices Currently closing the rt/log block device is done in the wrong spot, and far too early. So revampt it: - xfs_blkdev_put moved out of xfs_free_buftarg into the caller so that it is done after tearing down the buftarg completely. - call to xfs_unmountfs_close moved from xfs_mountfs into caller so that it's done after tearing down the filesystem completely. - xfs_unmountfs_close is renamed to xfs_close_devices and made static in xfs_super.c - opening of the block devices is split into a helper xfs_open_devices that is symetric in use to xfs_close_devices - xfs_unmountfs can now lose struct cred - error handling around device opening sanitized in xfs_fs_fill_super SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31193a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:25 +10:00
Christoph Hellwig	af15b8953a	[XFS] don't call xfs_freesb from xfs_mountfs failure case Freeing of the superblock is already handled in the caller, and that is more symmetric with the mount path, too. SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31192a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:23 +10:00
Christoph Hellwig	f8f15e42b4	[XFS] merge xfs_mount into xfs_fs_fill_super xfs_mount is already pretty linux-specific so merge it into xfs_fs_fill_super to allow for a more structured mount code in the next patches. xfs_start_flags and xfs_finish_flags also move to xfs_super.c. SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31189a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:21 +10:00
Christoph Hellwig	e48ad3160e	[XFS] merge xfs_unmount into xfs_fs_put_super / xfs_fs_fill_super xfs_unmount is small and already pretty Linux specific, so merge it into the callers. The real unmount path is simplified a little by doing a WARN_ON on the xfs_unmount_flush retval directly instead of propagating the error back to the caller, and the mout failure case in simplified significantly by removing the forced shutdown case and all the dmapi events that shouldn't be sent because the dmapi mount event hasn't been sent by that time either. SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31188a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:20 +10:00
Christoph Hellwig	61436febae	[XFS] kill xfs_igrow_start and xfs_igrow_finish xfs_igrow_start just expands to xfs_zero_eof with two asserts that are useless in the context of the only caller and some rather confusing comments. xfs_igrow_finish is just a few lines of code decorated again with useless asserts and confusing comments. Just kill those two and merge them into xfs_setattr. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31186a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:18 +10:00
Christoph Hellwig	48b62a1a97	[XFS] merge xfs_mntupdate into xfs_fs_remount xfs_mntupdate already is completely Linux specific due to the VFS flags passed in, so it might aswell be merged into xfs_fs_remount. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31185a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:17 +10:00
Christoph Hellwig	fa6adbe088	[XFS] kill xfs_uuid_unmount Quite useless wrapper that doesn't help making the code more readable. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31184a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:16 +10:00
David Chinner	4b166de0a0	[XFS] Update valid fields in xfs_mount_log_sb() Recent changes to update the version number during mount (attr2 stuff) failed to change the assert that checked for calid flags being changed on mount. Clearly this path hasn't been exercised by the test code.... SGI-PV: 981950 SGI-Modid: xfs-linux-melb:xfs-kern:31183a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:14 +10:00
Christoph Hellwig	911ee3de3d	[XFS] Kill attr_capable checks as already done in xattr_permission. No need for addition permission checks in the xattr handler, fs/xattr.c:xattr_permission() already does them, and in fact slightly more strict then what was in the attr_capable handlers. SGI-PV: 981809 SGI-Modid: xfs-linux-melb:xfs-kern:31164a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:13 +10:00
Matthew Wilcox	d748c62367	[XFS] Convert l_flushsema to a sv_t The l_flushsema doesn't exactly have completion semantics, nor mutex semantics. It's used as a list of tasks which are waiting to be notified that a flush has completed. It was also being used in a way that was potentially racy, depending on the semaphore implementation. By using a sv_t instead of a semaphore we avoid the need for a separate counter, since we know we just need to wake everything on the queue. Original waitqueue implementation from Matthew Wilcox. Cleanup and conversion to sv_t by Christoph Hellwig. SGI-PV: 981507 SGI-Modid: xfs-linux-melb:xfs-kern:31059a Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:12 +10:00
Michael Nishimoto	d729eae893	[XFS] Ensure that 2 GiB xfs logs work properly. We found this while experimenting with 2GiB xfs logs. The previous code never assumed that xfs logs would ever get so large. SGI-PV: 981502 SGI-Modid: xfs-linux-melb:xfs-kern:31058a Signed-off-by: Michael Nishimoto <miken@agami.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:11 +10:00
Denys Vlasenko	b41759cf11	[XFS] Remove unused wbc parameter from xfs_start_page_writeback() SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31057a Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:09 +10:00
Denys Vlasenko	4f0e8a9816	[XFS] Remove unused Falgs parameter from xfs_qm_dqpurge() SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31056a Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:08 +10:00
Denys Vlasenko	f0e2d93c29	[XFS] Remove unused arg from kmem_free() kmem_free() function takes (ptr, size) arguments but doesn't actually use second one. This patch removes size argument from all callsites. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31050a Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:07 +10:00
Tim Shimmin	7c12f29650	[XFS] Fix up noattr2 so that it will properly update the versionnum and features2 fields. Previously, mounting with noattr2 failed to achieve anything because although it cleared the attr2 mount flag, it would set it again as soon as it processed the superblock fields. The fix now has an explicit noattr2 flag and uses it later to fix up the versionnum and features2 fields. SGI-PV: 980021 SGI-Modid: xfs-linux-melb:xfs-kern:31003a Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:05 +10:00
Barry Naujok	f9f6dce019	[XFS] Split xfs_dir2_leafn_lookup_int into its two pieces of functionality SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30834a Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:58:03 +10:00
Aneesh Kumar K.V	5e745b041f	ext4: Fix small file fragmentation For small file block allocations, mballoc uses per cpu prealloc space. Use goal block when searching for the right prealloc space. Also make sure ext4_da_writepages tries to write all the pages for small files in single attempt Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-08-18 18:00:57 -04:00
Aneesh Kumar K.V	91246c0090	ext4: Initialize writeback_index to 0 when allocating a new inode The write_cache_pages() function uses the mapping->writeback_index as the starting index to write out when range_cyclic is set. Properly initialize writeback_index so that we start the writeout at index 0. This was found when debugging the small file fragmentation on ext4. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-08-19 21:14:52 -04:00
Aneesh Kumar K.V	16eb729564	ext4: make sure ext4_has_free_blocks returns 0 for ENOSPC Fix ext4_has_free_blocks() to return 0 when we don't have enough space. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-08-19 21:16:54 -04:00
Mingming Cao	525f4ed8dc	ext4: journal credit fix for the delayed allocation's writepages() function Previous delalloc writepages implementation started a new transaction outside of a loop which called get_block() to do the block allocation. Since we didn't know exactly how many blocks would need to be allocated, the estimated journal credits required was very conservative and caused many issues. With the reworked delayed allocation, a new transaction is created for each get_block(), thus we don't need to guess how many credits for the multiple chunk of allocation. We start every transaction with enough credits for inserting a single exent. When estimate the credits for indirect blocks to allocate a chunk of blocks, we need to know the number of data blocks to allocate. We use the total number of reserved delalloc datablocks; if that is too big, for non-extent files, we need to limit the number of blocks to EXT4_MAX_TRANS_BLOCKS. Code cleanup from Aneesh. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Reviewed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-08-19 22:15:58 -04:00
Aneesh Kumar K.V	a1d6cc563b	ext4: Rework the ext4_da_writepages() function With the below changes we reserve credit needed to insert only one extent resulting from a call to single get_block. This makes sure we don't take too much journal credits during writeout. We also don't limit the pages to write. That means we loop through the dirty pages building largest possible contiguous block request. Then we issue a single get_block request. We may get less block that we requested. If so we would end up not mapping some of the buffer_heads. That means those buffer_heads are still marked delay. Later in the writepage callback via __mpage_writepage we redirty those pages. We should also not limit/throttle wbc->nr_to_write in the filesystem writepages callback. That cause wrong behaviour in generic_sync_sb_inodes caused by wbc->nr_to_write being <= 0 Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-08-19 21:55:02 -04:00
Mingming Cao	f3bd1f3fa8	ext4: journal credits reservation fixes for DIO, fallocate DIO and fallocate credit calculation is different than writepage, as they do start a new journal right for each call to ext4_get_blocks_wrap(). This patch uses the helper function in DIO and fallocate case, passing a flag indicating that the modified data are contigous thus could account less indirect/index blocks. This patch also fixed the journal credit reservation for direct I/O (DIO). Previously the estimated credits for DIO only was calculated for non-extent files, which was not enough if the file is extent-based. Also fixed was fallocate double-counting credits for modifying the the superblock. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-08-19 22:16:03 -04:00
Mingming Cao	ee12b63068	ext4: journal credits reservation fixes for extent file writepage This patch modified the writepage/write_begin credit calculation for extent files, to use the credits caculation helper function. The current calculation of how many index/leaf blocks should be accounted is too conservetive, it always considered the worse case, where the tree level is 5, and in the case of multiple chunk allocations, it always assumed no blocks were dirtied in common across the allocations. This path uses the accurate depth of the inode with some extras to calculate the index blocks, and also less conservative in the case of multiple allocation accounting. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-08-19 22:16:05 -04:00
Mingming Cao	a02908f19c	ext4: journal credits calulation cleanup and fix for non-extent writepage When considering how many journal credits are needed for modifying a chunk of data, we need to account for the super block, inode block, quota blocks and xattr block, indirect/index blocks, also, group bitmap and group descriptor blocks for new allocation (including data and indirect/index blocks). There are many places in ext4 do the calculation on their own and often missed one or two meta blocks, and often they assume single block allocation, and did not considering the multile chunk of allocation case. This patch is trying to cleanup current journal credit code, provides some common helper funtion to calculate the journal credits, to be used for writepage, writepages, DIO, fallocate, migration, defrag, and for both nonextent and extent files. This patch modified the writepage/write_begin credit caculation for nonextent files, to use the new helper function. It also fixed the problem that writepage on nonextent files did not consider the case blocksize <pagesize, thus could possibelly need multiple block allocation in a single transaction. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-08-19 22:16:07 -04:00
Eric Sandeen	c001077f40	ext4: Fix bug where we return ENOSPC even though we have plenty of inodes The find_group_flex() function starts with best_flex as the parent_fbg_group, which happens to have 0 inodes free. Some of the flex groups searched have free blocks and free inodes, but the flex_freeb_ratio is < 10, so they're skipped. Then when a group is compared to the current "best" flex group, it does not have more free blocks than "best", so it is skipped as well. This continues until no flex group with free inodes is found which has a proper ratio or which has more free blocks than the "best" group, and we're left with a "best" group that has 0 inodes free, and we return -ENOSPC. We fix this by changing the logic so that if the current "best" flex group has no inodes free, and the current one does have room, it is promoted to the next "best." Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-08-19 22:19:50 -04:00
Josef Bacik	37609fd5ae	ext4: don't try to resize if there are no reserved gdt blocks left When trying to resize an ext4 fs and you run out of reserved gdt blocks, you get an error that doesn't actually tell you what went wrong, it just says that the gdb it picked is not correct, which is the case since you don't have any reserved gdt blocks left. This patch adds a check to make sure you have reserved gdt blocks to use, and if not prints out a more relevant error. Signed-off-by: Josef Bacik <jbacik@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: Andreas Dilger <adilger@sun.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-08-19 22:13:41 -04:00
Theodore Ts'o	88aa3cff4e	ext4: Use ext4_discard_reservations instead of mballoc-specific call In ext4_ext_truncate(), we should use the more generic ext4_discard_reservations() call so we do the right thing when the filesystem is mounted with the nomballoc option. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Mingming Cao <cmm@us.ibm.com>	2008-08-16 07:57:35 -04:00
Theodore Ts'o	d015641734	ext4: Fix ext4_dx_readdir hash collision handling This fixes a bug where readdir() would return a directory entry twice if there was a hash collision in an hash tree indexed directory. Signed-off-by: Eugene Dashevsky <eugene@ibrix.com> Signed-off-by: Mike Snitzer <msnitzer@ibrix.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-08-19 21:57:43 -04:00
Mingming Cao	cd21322616	ext4: Fix delalloc release block reservation for truncate Ext4 will release the reserved blocks for delayed allocations when inode is truncated/unlinked. If there is no reserved block at all, we shouldn't need to do so. But current code still tries to release the reserved blocks regardless whether the counters's value is 0. Continue to do that causes the later calculation to go wrong and a kernel BUG_ON() caught that. This doesn't happen for extent-based files, as the calculation for 0 reserved blocks was right for extent based file. This patch fixed the kernel BUG() due to above reason. It adds checks for 0 to avoid unnecessary release and fix calculation for non-extent files. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-08-19 22:16:59 -04:00
Theodore Ts'o	b4df203085	ext4: Fix potential truncate BUG due to i_prealloc_list being non-empty We need to call ext4_discard_reservation() earlier in ext4_truncate(), to avoid a BUG() in ext4_mb_return_to_preallocation(), which is called (ultimately) by ext4_free_blocks(). So we must ditch the blocks on i_prealloc_list before we start freeing the data blocks. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-08-13 21:44:34 -04:00
Aneesh Kumar K.V	bf068ee266	ext4: Handle unwritten extent properly with delayed allocation When using fallocate the buffer_heads are marked unwritten and unmapped. We need to map them in the writepages after a get_block. Otherwise we split the uninit extents, but never write the content to disk. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-08-19 22:16:43 -04:00
Linus Torvalds	c9272c4f9f	Merge branch 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: Ensure we call nfs_sb_deactive() after releasing the directory inode nfs_remount oops when rebooting + possible fix	2008-07-27 16:47:55 -07:00
Andrea Righi	940389b8af	task IO accounting: move all IO statistics in struct task_io_accounting Simplify the code of include/linux/task_io_accounting.h. It is also more reasonable to have all the task i/o-related statistics in a single struct (task_io_accounting). Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-27 16:12:28 -07:00
Trond Myklebust	744d18dbfa	NFS: Ensure we call nfs_sb_deactive() after releasing the directory inode In order to avoid the "Busy inodes after unmount" error message, we need to ensure that nfs_async_unlink_release() releases the super block after the call to nfs_free_unlinkdata(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-27 18:20:51 -04:00
Marc Zyngier	31c9446993	nfs_remount oops when rebooting + possible fix Jeff, Trond, The commit 48b605f83c920d8daa50e43fc2c7f718e04c7bfa (NFS: implement option checking when remounting NFS filesystems (resend)) generate an Oops on my platform when rebooting while its root FS on an NFS share (NFSv3, TCP) : Unmounting local filesystems...done. Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = c3d00000 [00000000] pgd=a3d72031, pte=00000000, *ppte=00000000 Internal error: Oops: 17 [#1] Modules linked in: cpufreq_powersave cpufreq_ondemand cpufreq_userspace cpufreq_conservative ext3 jbd sd_mod pata_pcmcia libata scsi_mod pcmcia loop firmware_class pxafb cfbcopyarea cfbimgblt cfbfillrect pxa2xx_cs pxa2xx_core pcmcia_core snd_pxa2xx_ac97 snd_ac97_codec ac97_bus snd_pxa2xx_pcm snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd isp116x_hcd soundcore rtc_sa1100 snd_page_alloc pxa25x_udc usbcore rtc_ds1307 rtc_core CPU: 0 Not tainted (2.6.26-03414-g33af79d-dirty #15) PC is at nfs_remount+0x40/0x264 LR is at do_remount_sb+0x158/0x194 pc : [<c00bbf54>] lr : [<c0076c40>] psr: 60000013 sp : c2dd1e70 ip : c2dd1e98 fp : c2dd1e94 r10: 00000040 r9 : c3d17000 r8 : c3c3fc40 r7 : 00000000 r6 : 00000000 r5 : c3d2b200 r4 : 00000000 r3 : 00000003 r2 : 00000000 r1 : c2dd1e9c r0 : c3c3fc00 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 0000397f Table: a3d00000 DAC: 00000015 Process mount (pid: 1462, stack limit = 0xc2dd0270) Stack: (0xc2dd1e70 to 0xc2dd2000) 1e60: 00000000 c3c3fc00 00000000 00000000 1e80: c3c3fc40 c3d17000 c2dd1ebc c2dd1e98 c0076c40 c00bbf20 c01c61e4 00000001 1ea0: c2dd1ebc 00000001 c3c3fc00 c2dd1ef0 c2dd1ee4 c2dd1ec0 c008c6d8 c0076af4 1ec0: 00000021 00000040 c2dd1ef0 c3d77000 c3eaa000 00000000 c2dd1f6c c2dd1ee8 1ee0: c008d1bc c008c5f8 00000000 c2dd0000 c3c0c320 c3805b38 c002064c 0001f820 1f00: 0001f810 00000001 00000001 00000000 c2dd0000 00000000 c2dd1f34 c2dd1f28 1f20: c005ead8 c005e6f8 c2dd1f44 c2dd1f38 c005eaf8 c005ead0 c2dd1f6c c2dd1f48 1f40: c008ae3c 00000000 c3d77000 0001f810 c0ed0021 c0020ca8 c2dd0000 00000000 1f60: c2dd1fa4 c2dd1f70 c008d2d4 c008d0bc 00000000 0001f810 c2dd1f9c c3eaa000 1f80: c3d17000 00000000 00000000 be8b6aa8 be8b6ad0 00000015 00000000 c2dd1fa8 1fa0: c0020b00 c008d254 00000000 be8b6aa8 0001f810 0001f820 0001f830 c0ed0021 1fc0: 00000000 be8b6aa8 be8b6ad0 00000015 00000000 be8b6ad0 0001f810 be8b6aa8 1fe0: 0001f810 be8b6964 0000aab8 40125124 60000010 0001f810 00000000 00000000 Backtrace: [<c00bbf14>] (nfs_remount+0x0/0x264) from [<c0076c40>] (do_remount_sb+0x158/0x194) r9:c3d17000 r8:c3c3fc40 r7:00000000 r6:00000000 r5:c3c3fc00 r4:00000000 [<c0076ae8>] (do_remount_sb+0x0/0x194) from [<c008c6d8>] (do_remount+0xec/0x118) r6:c2dd1ef0 r5:c3c3fc00 r4:00000001 [<c008c5ec>] (do_remount+0x0/0x118) from [<c008d1bc>] (do_mount+0x10c/0x198) [<c008d0b0>] (do_mount+0x0/0x198) from [<c008d2d4>] (sys_mount+0x8c/0xd4) [<c008d248>] (sys_mount+0x0/0xd4) from [<c0020b00>] (ret_fast_syscall+0x0/0x2c) r7:00000015 r6:be8b6ad0 r5:be8b6aa8 r4:00000000 Code: 0a000086 ea000006 e3530003 8a000004 (e5923000) ---[ end trace 55e1b689cf8c8a6a ]--- ------------[ cut here ]------------ WARNING: at kernel/exit.c:966 do_exit+0x3c/0x628() Modules linked in: cpufreq_powersave cpufreq_ondemand cpufreq_userspace cpufreq_conservative ext3 jbd sd_mod pata_pcmcia libata scsi_mod pcmcia loop firmware_class pxafb cfbcopyarea cfbimgblt cfbfillrect pxa2xx_cs pxa2xx_core pcmcia_core snd_pxa2xx_ac97 snd_ac97_codec ac97_bus snd_pxa2xx_pcm snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd isp116x_hcd soundcore rtc_sa1100 snd_page_alloc pxa25x_udc usbcore rtc_ds1307 rtc_core [<c0025168>] (dump_stack+0x0/0x14) from [<c0032154>] (warn_on_slowpath+0x4c/0x68) [<c0032108>] (warn_on_slowpath+0x0/0x68) from [<c003531c>] (do_exit+0x3c/0x628) r6:0000000b r5:c3c3dc80 r4:c2dd0000 [<c00352e0>] (do_exit+0x0/0x628) from [<c0025004>] (die+0x2b0/0x30c) [<c0024d54>] (die+0x0/0x30c) from [<c00270bc>] (__do_kernel_fault+0x6c/0x80) [<c0027050>] (__do_kernel_fault+0x0/0x80) from [<c00272e0>] (do_page_fault+0x210/0x230) r7:c3fa7118 r6:c3c3dc80 r5:c3d166a8 r4:00010000 [<c00270d0>] (do_page_fault+0x0/0x230) from [<c00201ec>] (do_DataAbort+0x3c/0xa0) [<c00201b0>] (do_DataAbort+0x0/0xa0) from [<c002064c>] (__dabt_svc+0x4c/0x60) Exception stack(0xc2dd1e28 to 0xc2dd1e70) 1e20: c3c3fc00 c2dd1e9c 00000000 00000003 00000000 c3d2b200 1e40: 00000000 00000000 c3c3fc40 c3d17000 00000040 c2dd1e94 c2dd1e98 c2dd1e70 1e60: c0076c40 c00bbf54 60000013 ffffffff r8:c3c3fc40 r7:00000000 r6:00000000 r5:c2dd1e5c r4:ffffffff [<c00bbf14>] (nfs_remount+0x0/0x264) from [<c0076c40>] (do_remount_sb+0x158/0x194) r9:c3d17000 r8:c3c3fc40 r7:00000000 r6:00000000 r5:c3c3fc00 r4:00000000 [<c0076ae8>] (do_remount_sb+0x0/0x194) from [<c008c6d8>] (do_remount+0xec/0x118) r6:c2dd1ef0 r5:c3c3fc00 r4:00000001 [<c008c5ec>] (do_remount+0x0/0x118) from [<c008d1bc>] (do_mount+0x10c/0x198) [<c008d0b0>] (do_mount+0x0/0x198) from [<c008d2d4>] (sys_mount+0x8c/0xd4) [<c008d248>] (sys_mount+0x0/0xd4) from [<c0020b00>] (ret_fast_syscall+0x0/0x2c) r7:00000015 r6:be8b6ad0 r5:be8b6aa8 r4:00000000 ---[ end trace 55e1b689cf8c8a6a ]--- /etc/rc6.d/S60umountroot: line 17: 1462 Segmentation fault mount $MOUNT_FORCE_OPT -n -o remount,ro -t dummytype dummydev / 2> /dev/null The new super.c:nfs_remount function doesn't check the validity of the options/options4 pointers. Unfortunately, this seems to happend. The obvious patch seems to check the pointers, and not to do anything if the happend to be NULL. Tested on an XScale PXA255 system, latest git. Regards, M. Signed-off-by: Marc Zyngier <marc.zyngier@altran.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-27 18:20:41 -04:00
Andrea Righi	5995477ab7	task IO accounting: improve code readability Put all i/o statistics in struct proc_io_accounting and use inline functions to initialize and increment statistics, removing a lot of single variable assignments. This also reduces the kernel size as following (with CONFIG_TASK_XACCT=y and CONFIG_TASK_IO_ACCOUNTING=y). text data bss dec hex filename 11651 0 0 11651 2d83 kernel/exit.o.before 11619 0 0 11619 2d63 kernel/exit.o.after 10886 132 136 11154 2b92 kernel/fork.o.before 10758 132 136 11026 2b12 kernel/fork.o.after 3082029 807968 4818600 8708597 84e1f5 vmlinux.o.before 3081869 807968 4818600 8708437 84e155 vmlinux.o.after Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Acked-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-27 09:58:20 -07:00
Linus Torvalds	9ee08c2df4	Merge git://git.infradead.org/mtd-2.6 * git://git.infradead.org/mtd-2.6: (57 commits) [MTD] [NAND] subpage read feature as a way to increase performance. CPUFREQ: S3C24XX NAND driver frequency scaling support. [MTD][NAND] au1550nd: remove unused variable [MTD] jedec_probe: Fix SST 16-bit chip detection [MTD][MTDPART] Fix a division by zero bug [MTD][MTDPART] Cleanup and document the erase region handling [MTD][MTDPART] Handle most checkpatch findings [MTD][MTDPART] Seperate main loop from per-partition code in add_mtd_partition [MTD] physmap: resume already suspended chips on failure to suspend [MTD] physmap: Fix suspend/resume/shutdown bugs. [MTD] [NOR] Fix -ETIMEO errors in CFI driver [MTD] [NAND] fsl_elbc_nand: fix section mismatch with CONFIG_MTD_OF_PARTS=y [JFFS2] Use .unlocked_ioctl [MTD] Fix const assignment in the MTD command line partitioning driver [MTD] [NOR] gen_probe: No debug message when debugging is disabled [MTD] [NAND] remove __PPC__ hardcoded address from DiskOnChip drivers [MTD] [MAPS] Remove the bast-flash driver. [MTD] [NAND] fsl_elbc_nand: ecclayout cleanups [MTD] [NAND] fsl_elbc_nand: implement support for flash-based BBT [MTD] [NAND] fsl_elbc_nand: fix OOB workability for large page NAND chips ...	2008-07-26 20:30:56 -07:00

... 3 4 5 6 7 ...

9906 Commits