Commit Graph

264 Commits

Author SHA1 Message Date
Trond Myklebust
f0dd2136da [PATCH] NFS: Clean up readdir changes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:34 -04:00
Olivier Galibert
00a9264227 [PATCH] NFS: Hide NFS server-generated readdir cookies from userland
NFSv3 currently returns the unsigned 64-bit cookie directly to
 userspace. The following patch causes the kernel to generate
 loff_t offsets for the benefit of userland.
 The current server-generated READDIR cookie is cached in the
 nfs_open_context instead of in filp->f_pos, so we still end up work
 correctly under directory insertions/deletion.

 Signed-off-by: Olivier Galibert <galibert@pobox.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:33 -04:00
Trond Myklebust
455a396710 [PATCH] NFSv4: Fix an Oops in the callback code.
The changeset "trond.myklebust@fys.uio.no|ChangeSet|20050322152404|16979"
 (RPC: Ensure XDR iovec length is initialized correctly in call_header)
 causes the NFSv4 callback code to BUG() due to an incorrectly initialized
 scratch buffer.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:29 -04:00
Reuben Farrelly
c56c275022 [PATCH] NFSv4: Fix build warning
From: Reuben Farrelly <reuben-lkml@reub.net>

 With gcc-4.0:

 fs/nfs/nfs4proc.c:2976: error: static declaration of
 'nfs4_file_inode_operations' follows non-static declaration
 fs/nfs/nfs4_fs.h:179: error: previous declaration of
 'nfs4_file_inode_operations' was here

 Signed-off-by: Andrew Morton <akpm@osdl.org>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:29 -04:00
Andrew Morton
3e9d41543b [PATCH] NFSv4: empty array fix
Older gcc's don't like this.

 fs/nfs/nfs4proc.c:2194: field `data' has incomplete type

 Signed-off-by: Andrew Morton <akpm@osdl.org>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:28 -04:00
Adrian Bunk
b7ef19560f [PATCH] NFSv4: fs/nfs/nfs4proc.c: small simplification
The Coverity checker noticed that such a simplification was possible.

 Signed-off-by: Adrian Bunk <bunk@stusta.de>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:27 -04:00
Andreas Gruenbacher
213484254c [PATCH] fix nfsacl pointer arithmetic and pg_class initialization bugs
* Pointer arithmetic bug: p is in word units. This fixes a memory
  corruption with big acls.
* Initialize pg_class to prevent a NULL pointer access.

 Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:27 -04:00
Trond Myklebust
458818ed76 [PATCH] NFS: Fix up v3 ACL caching code
Initialize the inode cache values correctly.
 Clean up __nfs3_forget_cached_acls()

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:26 -04:00
Andreas Gruenbacher
5c6a9f7d92 [PATCH] NFS: Cache the NFSv3 acls.
Attach acls to inodes in the icache to avoid unnecessary GETACL RPC
 round-trips.  As long as the client doesn't retrieve any acls itself, only the
 default acls of exiting directories and the default and access acls of new
 directories will end up in the cache, which preserves some memory compared to
 always caching the access and default acl of all files.

 Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
 Acked-by: Olaf Kirch <okir@suse.de>
 Signed-off-by: Andrew Morton <akpm@osdl.org>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:25 -04:00
Andreas Gruenbacher
055ffbea05 [PATCH] NFS: Fix handling of the umask when an NFSv3 default acl is present.
NFSv3 has no concept of a umask on the server side: The client applies
 the umask locally, and sends the effective permissions to the server.
 This behavior is wrong when files are created in a directory that has a
 default ACL.  In this case, the umask is supposed to be ignored, and
 only the default ACL determines the file's effective permissions.

 Usually its the server's task to conditionally apply the umask.  But
 since the server knows nothing about the umask, we have to do it on the
 client side.  This patch tries to fetch the parent directory's default
 ACL before creating a new file, computes the appropriate create mode to
 send to the server, and finally sets the new file's access and default
 acl appropriately.

 Many thanks to Buck Huppmann <buchk@pobox.com> for sending the initial
 version of this patch, as well as for arguing why we need this change.

 Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
 Acked-by: Olaf Kirch <okir@suse.de>
 Signed-off-by: Andrew Morton <akpm@osdl.org>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:24 -04:00
Andreas Gruenbacher
b7fa0554cf [PATCH] NFS: Add support for NFSv3 ACLs
This adds acl support fo nfs clients via the NFSACL protocol extension, by
 implementing the getxattr, listxattr, setxattr, and removexattr iops for the
 system.posix_acl_access and system.posix_acl_default attributes.  This patch
 implements a dumb version that uses no caching (and thus adds some overhead).
 (Another patch in this patchset adds caching as well.)

 Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
 Acked-by: Olaf Kirch <okir@suse.de>
 Signed-off-by: Andrew Morton <akpm@osdl.org>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:24 -04:00
Andreas Gruenbacher
a257cdd0e2 [PATCH] NFSD: Add server support for NFSv3 ACLs.
This adds functions for encoding and decoding POSIX ACLs for the NFSACL
 protocol extension, and the GETACL and SETACL RPCs.  The implementation is
 compatible with NFSACL in Solaris.

 Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
 Acked-by: Olaf Kirch <okir@suse.de>
 Signed-off-by: Andrew Morton <akpm@osdl.org>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:23 -04:00
Andreas Gruenbacher
a838cc49d9 [PATCH] NFSD: Add NFS3ERR_NOTSUPP to the nfsd error mapping table
Add the missing NFS3ERR_NOTSUPP error code (defined in NFSv3) to the
 system-to-protocol-error table in nfsd.  The nfsacl extension uses this error
 code.

 Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
 Signed-off-by: Olaf Kirch <okir@suse.de>
 Signed-off-by: Andrew Morton <akpm@osdl.org>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:21 -04:00
J. Bruce Fields
6a19275ada [PATCH] RPC: [PATCH] improve rpcauthauth_create error returns
Currently we return -ENOMEM for every single failure to create a new auth.
 This is actually accurate for auth_null and auth_unix, but for auth_gss it's a
 bit confusing.

 Allow rpcauth_create (and the ->create methods) to return errors.  With this
 patch, the user may sometimes see an EINVAL instead.  Whee.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:16 -04:00
J. Bruce Fields
e50a1c2e1f [PATCH] NFSv4: client-side caching NFSv4 ACLs
Add nfs4_acl field to the nfs_inode, and use it to cache acls.  Only cache
 acls of size up to a page.  Also prepare for up to a page of acl data even
 when the user doesn't pass in a buffer, as when they want to get the acl
 length to decide what size buffer to allocate.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:15 -04:00
J. Bruce Fields
4b580ee3dc [PATCH] NFSv4: ACL support for the NFSv4 client: write
Client-side write support for NFSv4 ACLs.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:14 -04:00
J. Bruce Fields
23ec6965c2 [PATCH] NFSv4: Client-side xdr for writing NFSv4 acls
Client-side support for NFSv4 acls: xdr encoding and decoding routines for
 writing acls

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:13 -04:00
J. Bruce Fields
aa1870af92 [PATCH] NFSv4: ACL support for the NFSv4 client: read
Client-side support for NFSv4 ACLs.  Exports the raw xdr code via the
 system.nfs4_acl extended attribute.  It is up to userspace to decode the acl
 (and to provide correctly xdr'd acls on setxattr), and to convert to/from
 POSIX ACLs if desired.

 This patch provides only the read support.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:13 -04:00
J. Bruce Fields
029d105e66 [PATCH] NFSv4: Client-side xdr for reading NFSv4 acls
Client-side support for NFSv4 acls: xdr encoding and decoding routines for
 reading acls

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:12 -04:00
J. Bruce Fields
9692820696 [PATCH] NFSv4: fix fattr size calculations
Make nfs4 fattr size calculations more explicit, revising them downward a
 bit in the process.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:11 -04:00
J. Bruce Fields
6b3b5496d7 [PATCH] NFSv4: Add {get,set,list}xattr methods for nfs4
Add {get,set,list}xattr methods for nfs4.  The new methods are no-ops, to be
 used by subsequent ACL patch.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:10 -04:00
Trond Myklebust
ada70d9425 [PATCH] NFS: Add hooks to allow common NFS attribute code to clear cached acls
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:09 -04:00
J. Bruce Fields
92cfc62cb8 [PATCH] NFS: Allow NFS versions to support different sets of inode operations.
ACL support will require supporting additional inode operations in v4
 (getxattr, setxattr, listxattr).  This patch allows different protocol versions
 to support different inode operations by adding a file_inode_ops to the
 nfs_rpc_ops (to match the existing dir_inode_ops).

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:09 -04:00
Trond Myklebust
464a98bd70 [PATCH] NFS: cleanup: shrink struct nfs_open_context
Remove the wait queue, and replace the functions that depended on it
 with wait_on_bit().

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:08 -04:00
Trond Myklebust
a656db9987 [PATCH] NFS: Remove unused NFS inode field readdir_timestamp.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:07 -04:00
Trond Myklebust
4ce79717ce [PATCH] NFS: Header file cleanup...
- Move NFSv4 state definitions into a private header file.
 - Clean up gunk in nfs_fs.h

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:06 -04:00
Trond Myklebust
9085bbcb76 [PATCH] NFS: Kill annoying mount version mismatch printks
Ensure that we fix up the missing fields in the nfs_mount_data with
 sane defaults for older versions of mount, and return errors in the
 cases where we cannot.

 Convert a bunch of annoying warnings into dprintks()

 Return -EPROTONOSUPPORT rather than EIO if mount() tries to set NFSv3
 without it actually being compiled in.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:04 -04:00
Trond Myklebust
5ee0ed7d3a [PATCH] RPC: Make rpc_create_client() probe server for RPC program+version support
Ensure that we don't create an RPC client without checking that the server
 does indeed support the RPC program + version that we are trying to set up.

 This enables us to immediately return an error to "mount" if it turns out
 that the server is only supporting NFSv2, when we requested NFSv3 or NFSv4.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:04 -04:00
Trond Myklebust
5b616f5d59 [PATCH] RPC: Make rpc_create_client() destroy the transport on failure.
This saves us a couple of lines of cleanup code for each call.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:03 -04:00
Linus Torvalds
2a5a68b840 Merge rsync://oss.sgi.com/git/xfs-2.6 2005-06-21 19:51:18 -07:00
Jeremy White
9769f4eb3f [PATCH] isofs: show hidden files, add granularity for assoc/hidden files flags
The current isofs treatment of hidden files is flawed in two ways.  First,
it does not provide sufficient granularity; it hides both 'hidden' files
and 'associated' files (resource fork for Mac files).  Second, the default
behavior to completely strip hidden files, while an admirable
implementation of the spec, is a poor choice given the real world use of
hidden files as a poor mans copy protection scheme for MSDOS and Windows
based systems.  A longer description of this is available here:

   http://www.uwsg.iu.edu/hypermail/linux/kernel/0205.3/0267.html

This patch was originally built after a few private conversations with Alan
Cox; I shamefully failed to persist in seeing it go forward, I hope to make
amends now.

This patch introduces granularity by allowing explicit control for both
hidden and associated files.  It also reverses the default so that by
default, hidden files are treated as regular files on the iso9660 file
system.

This allow Wine to process Windows CDs, including those that are hybrid
Mac/Windows CDs properly and completely, without our having to go muck up
peoples fstabs as we do now.  (I have tested this with such a hybrid +
hidden CD and have verified that this patch works as claimed).

Signed-off-by: Jeremy White <jwhite@codeweavers.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:38 -07:00
Andrew Morton
f2966632a1 [PATCH] rock: handle directory overflows
Handle the case where the variable-sized part of a rock-ridge directory entry
overhangs the end of the buffer which we allocated for it.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:38 -07:00
Andrew Morton
642217c17b [PATCH] rock: rename union members
The silly thing does:

	struct foo { ... };
	...
	#define foo 42

so you can no longer refer to `struct foo' in C code.

Rename the structures.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:38 -07:00
Andrew Morton
e595447e17 [PATCH] rock.c: handle corrupted directories
The bug in rock.c is that it's totally trusting of the contents of the
directories.  If the directory says there's a continuation 10000 bytes into
this 4k block then we cheerily poke around in memory we don't own and oops.

So change rock_continue() to apply various sanity checks, at least ensuring
that the offset+length remain within the bounds for the header part of a
struct rock_ridge directory entry.

Note that the kernel can still overindex the buffer due to the variable size
of the rock-ridge directory entries.  We cannot check that in rock_continue()
unless we go parse the directory entry's signature and work out its size.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:38 -07:00
Andrew Morton
9eb7f2c67c [PATCH] isofs: remove debug stuff
isofs/inode.c:

- Remove some crufty leak detection code

- coding style cleanups

- kfree(NULL) is permitted.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:38 -07:00
Andrew Morton
a089221c5e [PATCH] rock: lindent rock.h
So we have a couple of rock-ridge bugs.  First up, rotoroot the poor thing
into something which it is possible to work on.

Feed rock.h through Lindent, tidy a couple of things by hand.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:37 -07:00
Andrew Morton
7373909de4 [PATCH] rock: comment tidies
Be a bit more standard in comment layout.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:37 -07:00
Andrew Morton
ba40aaf043 [PATCH] rock: remove MAYBE_CONTINUE
- remove the MAYBE_CONTINUE macro

- kfree(NULL) is OK.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:37 -07:00
Andrew Morton
76ab07ebc3 [PATCH] rock: remove SETUP_ROCK_RIDGE
- Remove the SETUP_ROCK_RIDGE macro.

- In rock_ridge_symlink_readpage(), rename raw_inode to raw_de.  It points
  at a directory entry, not an inode.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:37 -07:00
Andrew Morton
04f7aa9c7d [PATCH] rock: remove CHECK_CE
Remove the CHECK_CE macro

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:37 -07:00
Andrew Morton
a40ea8f22e [PATCH] rock: remove CONTINUE_DECLS
Remove the CONTINUE_DECLS macro.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:36 -07:00
Andrew Morton
12121714fb [PATCH] rock: remove CHECK_SP
Remove the CHECK_SP macro.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:36 -07:00
Andrew Morton
7fa393a1d3 [PATCH] rock: manual tidies
Fix stuff which Lindent got wrong, rework a few deeply-nested blocks.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:36 -07:00
Andrew Morton
1d37211638 [PATCH] rock: lindent it
Trying to turn rock.c into something which humans can read so we can fix some
bugs.

Start out by feeding it through scripts/Lindent.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:36 -07:00
Ian Kent
1684b2bba6 [PATCH] autofs4: bad lookup fix
For browsable autofs maps, a mount request that arrives at the same time an
expire is happening can fail to perform the needed mount.

This happens becuase the directory exists and so the revalidate succeeds when
we need it to fail so that lookup is called on the same dentry to do the
mount.  Instead lookup is called on the next path component which should be
whithin the mount, but the parent isn't mounted.

The solution is to allow the revalidate to continue and perform the mount as
no directory creation (at mount time) is needed for browsable mount entries.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:35 -07:00
Ian Kent
cc9acc8858 [PATCH] autofs4: post expire race fix
At the tail end of an expire it's possible for a process to enter
autofs4_wait, with a waitq type of NFY_NONE but find that the expire is
finished.  In this cause autofs4_wait will try to create a new wait but not
notify the daemon leading to a hang.  As the wait type is meant to delay mount
requests from revalidate or lookup during an expire and the expire is done all
we need to do is check if the dentry is a mountpoint.  If it's not then we're
done.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:35 -07:00
Ian Kent
9b1e3afd6d [PATCH] autofs4: avoid panic on bind mount of autofs owned directory
While this is not a solution to bind and move mounts on autofs owned
directories it is necessary to fix the trady error handling.

At least it avoids the kernel panic I observed checking out bug #4589.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:35 -07:00
Gerald Schaefer
8680e22f29 [PATCH] VFS: memory leak in do_kern_mount()
There is a memory leak during mount when CONFIG_SECURITY is enabled and
mount options are specified.

Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com>
Acked-by: James Morris <jmorris@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:22 -07:00
Darren Hart
1ad539b2bd [PATCH] vm: try_to_free_pages unused argument
try_to_free_pages accepts a third argument, order, but hasn't used it since
before 2.6.0.  The following patch removes the argument and updates all the
calls to try_to_free_pages.

Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:17 -07:00
Wolfgang Wander
1363c3cd86 [PATCH] Avoiding mmap fragmentation
Ingo recently introduced a great speedup for allocating new mmaps using the
free_area_cache pointer which boosts the specweb SSL benchmark by 4-5% and
causes huge performance increases in thread creation.

The downside of this patch is that it does lead to fragmentation in the
mmap-ed areas (visible via /proc/self/maps), such that some applications
that work fine under 2.4 kernels quickly run out of memory on any 2.6
kernel.

The problem is twofold:

  1) the free_area_cache is used to continue a search for memory where
     the last search ended.  Before the change new areas were always
     searched from the base address on.

     So now new small areas are cluttering holes of all sizes
     throughout the whole mmap-able region whereas before small holes
     tended to close holes near the base leaving holes far from the base
     large and available for larger requests.

  2) the free_area_cache also is set to the location of the last
     munmap-ed area so in scenarios where we allocate e.g.  five regions of
     1K each, then free regions 4 2 3 in this order the next request for 1K
     will be placed in the position of the old region 3, whereas before we
     appended it to the still active region 1, placing it at the location
     of the old region 2.  Before we had 1 free region of 2K, now we only
     get two free regions of 1K -> fragmentation.

The patch addresses thes issues by introducing yet another cache descriptor
cached_hole_size that contains the largest known hole size below the
current free_area_cache.  If a new request comes in the size is compared
against the cached_hole_size and if the request can be filled with a hole
below free_area_cache the search is started from the base instead.

The results look promising: Whereas 2.6.12-rc4 fragments quickly and my
(earlier posted) leakme.c test program terminates after 50000+ iterations
with 96 distinct and fragmented maps in /proc/self/maps it performs nicely
(as expected) with thread creation, Ingo's test_str02 with 20000 threads
requires 0.7s system time.

Taking out Ingo's patch (un-patch available per request) by basically
deleting all mentions of free_area_cache from the kernel and starting the
search for new memory always at the respective bases we observe: leakme
terminates successfully with 11 distinctive hardly fragmented areas in
/proc/self/maps but thread creating is gringdingly slow: 30+s(!) system
time for Ingo's test_str02 with 20000 threads.

Now - drumroll ;-) the appended patch works fine with leakme: it ends with
only 7 distinct areas in /proc/self/maps and also thread creation seems
sufficiently fast with 0.71s for 20000 threads.

Signed-off-by: Wolfgang Wander <wwc@rentec.com>
Credit-to: "Richard Purdie" <rpurdie@rpsys.net>
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu> (partly)
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:16 -07:00