iv/linux

Eric Dumazet 6a2d7a955d [PATCH] SLAB: use a multiply instead of a divide in obj_to_index()		2006-12-13 09:05:49 -08:00

When some objects are allocated by one CPU but freed by another CPU we can consume lot of cycles doing divides in obj_to_index().

(Typical load on a dual processor machine where network interrupts are handled by one particular CPU (allocating skbufs), and the other CPU is running the application (consuming and freeing skbufs))

Here on one production server (dual-core AMD Opteron 285), I noticed this divide took 1.20 % of CPU_CLK_UNHALTED events in kernel. But Opteron are quite modern cpus and the divide is much more expensive on oldest architectures :

On a 200 MHz sparcv9 machine, the division takes 64 cycles instead of 1 cycle for a multiply.

Doing some math, we can use a reciprocal multiplication instead of a divide.

If we want to compute V = (A / B) (A and B being u32 quantities) we can instead use :

    V = ((u64)A * RECIPROCAL(B)) >> 32 ;

where RECIPROCAL(B) is precalculated to ((1LL << 32) + (B - 1)) / B

Note :

I wrote pure C code for clarity. gcc output for i386 is not optimal but acceptable :

    mull   0x14(%ebx)
    mov    %edx,%eax        // part of the >> 32
    xor    %edx,%edx        // useless
    mov    %eax,(%esp)      // could be avoided
    mov    %edx,0x4(%esp)   // useless
    mov    (%esp),%ebx

[akpm@osdl.org: small cleanups]
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
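
The reciprocal trick described in the commit message can be sketched in plain userspace C. This is an illustration only, not the code from the patch: the function names, the use of <stdint.h> types in place of the kernel's u32/u64, and the hypothetical obj_index() wrapper are all assumptions. The sketch excludes B == 1, because the precomputed value would then be 2^32 and would not fit in 32 bits. The result is exact when A is a multiple of B, which is the case for an object offset divided by the object size; for arbitrary A close to 2^32 it may be off by one.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Precompute RECIPROCAL(b) = ((1 << 32) + (b - 1)) / b, i.e. 2^32 / b
 * rounded up. Assumption of this sketch: b >= 2, since b == 1 would
 * yield 2^32, which overflows a 32-bit result.
 */
static uint32_t reciprocal_value(uint32_t b)
{
    assert(b >= 2);
    uint64_t val = (1ULL << 32) + (b - 1);
    return (uint32_t)(val / b);
}

/*
 * Approximate a / b as (a * RECIPROCAL(b)) >> 32, using one 32x32->64
 * multiply and a shift instead of a hardware divide.
 */
static uint32_t reciprocal_divide(uint32_t a, uint32_t r)
{
    return (uint32_t)(((uint64_t)a * r) >> 32);
}

/*
 * Hypothetical stand-in for an obj_to_index()-style lookup: "offset" is
 * the byte offset of an object from the start of its slab, and
 * "reciprocal_size" is reciprocal_value(object_size), computed once when
 * the cache is created.
 */
static uint32_t obj_index(uint32_t offset, uint32_t reciprocal_size)
{
    return reciprocal_divide(offset, reciprocal_size);
}
```

The division is paid once, when the reciprocal is precomputed for a given object size; every later index lookup then costs a multiply and a shift.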
allocpercpu.c	[PATCH] Allow NULL pointers in percpu_free	2006-12-07 08:39:22 -08:00
backing-dev.c	[PATCH] separate bdi congestion functions from queue congestion functions	2006-10-20 10:26:35 -07:00
bootmem.c	[PATCH] remove EXPORT_UNUSED_SYMBOL'ed symbols	2006-12-07 08:39:44 -08:00
bounce.c	[PATCH] BLOCK: Separate the bounce buffering code from the highmem code [try #6 ]	2006-09-30 20:32:11 +02:00
fadvise.c	[PATCH] mm: change uses of f_{dentry,vfsmnt} to use f_path	2006-12-08 08:28:43 -08:00
filemap_xip.c	[PATCH] mm: change uses of f_{dentry,vfsmnt} to use f_path	2006-12-08 08:28:43 -08:00
filemap.c	[PATCH] dio: only call aio_complete() after returning -EIOCBQUEUED	2006-12-10 09:57:21 -08:00
filemap.h	Remove all inclusions of <linux/config.h>	2006-10-04 03:38:54 -04:00
fremap.c	[PATCH] kill install_file_pte's pte_val	2006-12-07 08:39:23 -08:00
highmem.c	[PATCH] BLOCK: Separate the bounce buffering code from the highmem code [try #6 ]	2006-09-30 20:32:11 +02:00
hugetlb.c	[PATCH] cpuset: rework cpuset_zone_allowed api	2006-12-13 09:05:49 -08:00
internal.h	[PATCH] mm: VM_BUG_ON	2006-09-26 08:48:44 -07:00
Kconfig	Fix "can not" in Documentation and Kconfig	2006-10-03 22:53:09 +02:00
madvise.c	[PATCH] Fix MADV_REMOVE protection checking	2006-04-17 18:22:18 -07:00
Makefile	[PATCH] separate bdi congestion functions from queue congestion functions	2006-10-20 10:26:35 -07:00
memory_hotplug.c	[PATCH] Get rid of zone_table[]	2006-12-07 08:39:20 -08:00
memory.c	[PATCH] read_zero_pagealigned() locking fix	2006-12-10 09:55:39 -08:00
mempolicy.c	[PATCH] struct path: convert mm	2006-12-08 08:28:47 -08:00
mempool.c	[PATCH] dm: work around mempool_alloc, bio_alloc_bioset deadlocks	2006-09-01 11:39:09 -07:00
migrate.c	[PATCH] radix-tree: RCU lockless readside	2006-12-07 08:39:25 -08:00
mincore.c	[PATCH] freepgt: sys_mincore ignore FIRST_USER_PGD_NR	2005-04-19 13:29:20 -07:00
mlock.c	[PATCH] mlock cleanup	2006-12-07 08:39:22 -08:00
mmap.c	[PATCH] mm: change uses of f_{dentry,vfsmnt} to use f_path	2006-12-08 08:28:43 -08:00
mmzone.c	[PATCH] remove EXPORT_UNUSED_SYMBOL'ed symbols	2006-12-07 08:39:44 -08:00
mprotect.c	[PATCH] paravirt: lazy mmu mode hooks.patch	2006-10-01 00:39:33 -07:00
mremap.c	[PATCH] paravirt: lazy mmu mode hooks.patch	2006-10-01 00:39:33 -07:00
msync.c	[PATCH] mm: msync() cleanup	2006-09-26 08:48:45 -07:00
nommu.c	[PATCH] struct path: convert mm	2006-12-08 08:28:47 -08:00
oom_kill.c	[PATCH] cpuset: rework cpuset_zone_allowed api	2006-12-13 09:05:49 -08:00
page_alloc.c	[PATCH] cpuset: rework cpuset_zone_allowed api	2006-12-13 09:05:49 -08:00
page_io.c	[PATCH] swsusp: use block device offsets to identify swap locations	2006-12-07 08:39:27 -08:00
page-writeback.c	[PATCH] io-accounting: write accounting	2006-12-10 09:55:41 -08:00
pdflush.c	[PATCH] Add include/linux/freezer.h and move definitions from sched.h	2006-12-07 08:39:27 -08:00
prio_tree.c	Linux-2.6.12-rc2	2005-04-16 15:20:36 -07:00
readahead.c	[PATCH] io-accounting-read-accounting nfs fix	2006-12-10 09:55:41 -08:00
rmap.c	[PATCH] mm: more commenting on lock ordering	2006-10-20 10:26:44 -07:00
shmem_acl.c	[PATCH] Fix typos in mm/shmem_acl.c	2006-10-11 11:14:23 -07:00
shmem.c	[PATCH] mm: change uses of f_{dentry,vfsmnt} to use f_path	2006-12-08 08:28:43 -08:00
slab.c	[PATCH] SLAB: use a multiply instead of a divide in obj_to_index()	2006-12-13 09:05:49 -08:00
slob.c	[PATCH] More slab.h cleanups	2006-12-13 09:05:49 -08:00
sparse.c	[PATCH] numa node ids are int, page_to_nid and zone_to_nid should return int	2006-12-07 08:39:23 -08:00
swap_state.c	[PATCH] lockdep: locking init debugging improvement	2006-07-03 15:27:02 -07:00
swap.c	[PATCH] hotplug CPU: clean up hotcpu_notifier() use	2006-12-07 08:39:39 -08:00
swapfile.c	[PATCH] mm: change uses of f_{dentry,vfsmnt} to use f_path	2006-12-08 08:28:43 -08:00
thrash.c	[PATCH] make mm/thrash.c:global_faults static	2006-12-07 08:39:22 -08:00
tiny-shmem.c	[PATCH] struct path: convert mm	2006-12-08 08:28:47 -08:00
truncate.c	[PATCH] io-accounting: write-cancel accounting	2006-12-10 09:55:41 -08:00
util.c	[PATCH] slab: clean up leak tracking ifdefs a little bit	2006-10-04 07:55:13 -07:00
vmalloc.c	[PATCH] Fix strange size check in __get_vm_area_node()	2006-11-16 11:43:38 -08:00
vmscan.c	[PATCH] cpuset: rework cpuset_zone_allowed api	2006-12-13 09:05:49 -08:00
vmstat.c	[PATCH] struct seq_operations and struct file_operations constification	2006-12-07 08:39:46 -08:00