linux/fs
Nick Piggin 9d55c369bb fs: implement faster dentry memcmp
The standard memcmp function on a Westmere system shows up hot in
profiles in the `git diff` workload (both parallel and single threaded),
and it is likely due to the costs associated with trapping into
microcode, and little opportunity to improve memory access (dentry
name is not likely to take up more than a cacheline).

So replace it with an open-coded byte comparison. This increases code
size by 8 bytes in the critical __d_lookup_rcu function, but the
speedup is huge, averaging 10 runs of each:

git diff st   user   sys    elapsed  CPU
before        1.15   2.57   3.82      97.1
after         1.14   2.35   3.61      96.8

git diff mt   user   sys    elapsed  CPU
before        1.27   3.85   1.46     349
after         1.26   3.54   1.43     333

Elapsed time for single threaded git diff at 95.0% confidence:
        -0.21  +/- 0.01
        -5.45% +/- 0.24%

It's -0.66% +/- 0.06% elapsed time on my Opteron, so rep cmp costs on the
fam10h seem to be relatively smaller, but there is still a win.
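
For illustration only, an open-coded byte comparison of the kind described above might look like the sketch below. The name name_cmp_sketch, its signature, and its 0/1 return convention are assumptions made for this example and are not taken from the commit itself; the actual change lives in dcache.c.

```c
#include <stddef.h>

/*
 * Hypothetical sketch of an open-coded name comparison (assumed name and
 * signature, not the kernel's). Returns 0 when the two names are equal and
 * 1 otherwise. Unlike memcmp, it reports no ordering, which a lookup path
 * that only tests for equality would not need.
 */
static inline int name_cmp_sketch(const unsigned char *cs, size_t scount,
                                  const unsigned char *ct, size_t tcount)
{
	/* Names of different lengths can never match. */
	if (scount != tcount)
		return 1;

	/* Compare one byte at a time and stop at the first mismatch. */
	while (tcount--) {
		if (*cs++ != *ct++)
			return 1;
	}
	return 0;
}
```

Because names are short (usually well under a cacheline, as noted above), a simple loop like this avoids the fixed startup cost of a string instruction; whether it wins on a given CPU still has to be measured.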

Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07 17:50:32 +11:00
9p fs: dcache per-inode inode alias locking 2011-01-07 17:50:31 +11:00
adfs fs: dcache reduce branches in lookup path 2011-01-07 17:50:28 +11:00
affs fs: dcache per-inode inode alias locking 2011-01-07 17:50:31 +11:00
afs fs: provide rcu-walk aware permission i_ops 2011-01-07 17:50:29 +11:00
autofs4 fs: rcu-walk aware d_revalidate method 2011-01-07 17:50:29 +11:00
befs fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
bfs fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
btrfs btrfs: provide simple rcu-walk ACL implementation 2011-01-07 17:50:30 +11:00
cachefiles llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
ceph fs: provide rcu-walk aware permission i_ops 2011-01-07 17:50:29 +11:00
cifs fs: dcache per-inode inode alias locking 2011-01-07 17:50:31 +11:00
coda fs: provide rcu-walk aware permission i_ops 2011-01-07 17:50:29 +11:00
configfs fs: dcache reduce branches in lookup path 2011-01-07 17:50:28 +11:00
cramfs new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
debugfs convert get_sb_single() users 2010-10-29 04:16:28 -04:00
devpts convert get_sb_single() users 2010-10-29 04:16:28 -04:00
dlm Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm 2010-10-22 17:33:16 -07:00
ecryptfs fs: provide rcu-walk aware permission i_ops 2011-01-07 17:50:29 +11:00
efs fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
exofs fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
exportfs fs: dcache per-inode inode alias locking 2011-01-07 17:50:31 +11:00
ext2 ext2,3,4: provide simple rcu-walk ACL implementation 2011-01-07 17:50:30 +11:00
ext3 ext2,3,4: provide simple rcu-walk ACL implementation 2011-01-07 17:50:30 +11:00
ext4 ext2,3,4: provide simple rcu-walk ACL implementation 2011-01-07 17:50:30 +11:00
fat fs: rcu-walk aware d_revalidate method 2011-01-07 17:50:29 +11:00
freevxfs fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
fscache
fuse fs: provide rcu-walk aware permission i_ops 2011-01-07 17:50:29 +11:00
gfs2 fs: provide rcu-walk aware permission i_ops 2011-01-07 17:50:29 +11:00
hfs fs: rcu-walk aware d_revalidate method 2011-01-07 17:50:29 +11:00
hfsplus fs: dcache reduce branches in lookup path 2011-01-07 17:50:28 +11:00
hostfs fs: provide rcu-walk aware permission i_ops 2011-01-07 17:50:29 +11:00
hpfs fs: provide rcu-walk aware permission i_ops 2011-01-07 17:50:29 +11:00
hppfs fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
hugetlbfs fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
isofs fs: dcache reduce branches in lookup path 2011-01-07 17:50:28 +11:00
jbd Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6 2010-10-27 20:13:18 -07:00
jbd2 jbd2: fix /proc/fs/jbd2/<dev> when using an external journal 2010-11-17 21:46:26 -05:00
jffs2 fs: provide rcu-walk aware permission i_ops 2011-01-07 17:50:29 +11:00
jfs fs: provide rcu-walk aware permission i_ops 2011-01-07 17:50:29 +11:00
lockd BKL: remove extraneous #include <smp_lock.h> 2010-11-17 08:59:32 -08:00
logfs fs: provide rcu-walk aware permission i_ops 2011-01-07 17:50:29 +11:00
minix fs: dcache reduce branches in lookup path 2011-01-07 17:50:28 +11:00
ncpfs fs: rcu-walk aware d_revalidate method 2011-01-07 17:50:29 +11:00
nfs fs: dcache per-inode inode alias locking 2011-01-07 17:50:31 +11:00
nfs_common
nfsd fs: dcache scale dentry refcount 2011-01-07 17:50:21 +11:00
nilfs2 fs: provide rcu-walk aware permission i_ops 2011-01-07 17:50:29 +11:00
nls
notify fs: dcache per-inode inode alias locking 2011-01-07 17:50:31 +11:00
ntfs fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
ocfs2 fs: dcache per-inode inode alias locking 2011-01-07 17:50:31 +11:00
omfs new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
openpromfs fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
partitions Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block 2010-10-25 07:45:10 -07:00
proc fs: provide rcu-walk aware permission i_ops 2011-01-07 17:50:29 +11:00
qnx4 fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
quota quota: Fix possible oops in __dquot_initialize() 2010-10-28 01:30:06 +02:00
ramfs convert get_sb_nodev() users 2010-10-29 04:16:31 -04:00
reiserfs fs: provide rcu-walk aware permission i_ops 2011-01-07 17:50:29 +11:00
romfs fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
squashfs fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
sysfs fs: provide rcu-walk aware permission i_ops 2011-01-07 17:50:29 +11:00
sysv fs: dcache reduce branches in lookup path 2011-01-07 17:50:28 +11:00
ubifs fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
udf fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
ufs fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
xfs xfs: provide simple rcu-walk ACL implementation 2011-01-07 17:50:30 +11:00
aio.c new helper: ihold() 2010-10-25 21:26:11 -04:00
anon_inodes.c fs: improve scalability of pseudo filesystems 2011-01-07 17:50:32 +11:00
attr.c
bad_inode.c fs: provide rcu-walk aware permission i_ops 2011-01-07 17:50:29 +11:00
binfmt_aout.c Don't dump task struct in a.out core-dumps 2010-10-14 10:57:40 -07:00
binfmt_elf_fdpic.c
binfmt_elf.c ARM: 6342/1: fix ASLR of PIE executables 2010-10-08 10:02:53 +01:00
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c convert get_sb_single() users 2010-10-29 04:16:28 -04:00
binfmt_script.c
binfmt_som.c
bio-integrity.c
bio.c bio: take care not overflow page count when mapping/copying user data 2010-11-10 14:40:43 +01:00
block_dev.c fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
buffer.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2010-10-26 17:58:44 -07:00
char_dev.c Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:56 -07:00
compat_binfmt_elf.c
compat_ioctl.c BKL: remove extraneous #include <smp_lock.h> 2010-11-17 08:59:32 -08:00
compat.c exec: copy-and-paste the fixes into compat_do_execve() paths 2010-11-30 17:56:38 -08:00
dcache.c fs: implement faster dentry memcmp 2011-01-07 17:50:32 +11:00
dcookies.c
direct-io.c fs/direct-io.c: fix truncation error in dio_complete() return 2010-10-26 16:52:13 -07:00
drop_caches.c
eventfd.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
eventpoll.c epoll: make epoll_wait() use the hrtimer range feature 2010-10-27 18:03:18 -07:00
exec.c install_special_mapping skips security_file_mmap check. 2010-12-15 12:30:36 -08:00
fcntl.c fasync: Fix placement of FASYNC flag comment 2010-10-27 18:17:02 -07:00
fifo.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
file_table.c fs: allow for more than 2^31 files 2010-10-26 16:52:15 -07:00
file.c
filesystems.c fs: rcu-walk for path lookup 2011-01-07 17:50:27 +11:00
fs_struct.c fs: fs_struct use seqlock 2011-01-07 17:50:27 +11:00
fs-writeback.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable 2010-10-30 09:05:48 -07:00
generic_acl.c fs: provide simple rcu-walk generic_check_acl implementation 2011-01-07 17:50:29 +11:00
inode.c fs: avoid inode RCU freeing for pseudo fs 2011-01-07 17:50:26 +11:00
internal.h braino in internal.h 2010-10-29 05:49:13 -04:00
ioctl.c Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 2010-11-19 19:46:45 -08:00
ioprio.c ioprio: grab rcu_read_lock in sys_ioprio_{set,get}() 2010-11-15 10:23:31 +01:00
Kconfig Merge 'staging-next' to Linus's tree 2010-10-28 09:44:56 -07:00
Kconfig.binfmt coredump: default CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y 2010-10-27 18:03:12 -07:00
libfs.c fs: dcache reduce branches in lookup path 2011-01-07 17:50:28 +11:00
locks.c fs: dcache scale dentry refcount 2011-01-07 17:50:21 +11:00
Makefile Merge 'staging-next' to Linus's tree 2010-10-28 09:44:56 -07:00
mbcache.c
mpage.c
namei.c fs: provide rcu-walk aware permission i_ops 2011-01-07 17:50:29 +11:00
namespace.c fs: dcache remove d_mounted 2011-01-07 17:50:28 +11:00
nfsctl.c
no-block.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
open.c fix open/umount race 2010-10-29 04:14:56 -04:00
pipe.c fs: improve scalability of pseudo filesystems 2011-01-07 17:50:32 +11:00
pnode.c
pnode.h
posix_acl.c
read_write.c BKL: remove extraneous #include <smp_lock.h> 2010-11-17 08:59:32 -08:00
read_write.h
readdir.c
select.c epoll: make epoll_wait() use the hrtimer range feature 2010-10-27 18:03:18 -07:00
seq_file.c fs: take dcache_lock inside __d_path 2010-10-25 21:26:12 -04:00
signalfd.c Merge branch 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6 2010-10-26 10:13:10 -07:00
splice.c Export 'get_pipe_info()' to other users 2010-11-28 14:09:57 -08:00
stack.c
stat.c
statfs.c
super.c fs: dcache per-bucket dcache hash locking 2011-01-07 17:50:31 +11:00
sync.c
timerfd.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
utimes.c
xattr_acl.c
xattr.c