linux/arch/s390
Christian Borntraeger fb3d1c085c s390: let the compiler do page clearing
The hardware folks told me that for page clearing "when you exactly
know what to do, hand written xc+pfd is usually faster than mvcl for
page clearing, as it saves millicode overhead and parameter parsing
and checking" as long as you don't need the cache bypassing.
Turns out that gcc already does a proper xc,pfd loop.

A small test on z196 that does

buff = mmap(NULL, bufsize, PROT_EXEC|PROT_WRITE|PROT_READ, MAP_PRIVATE|MAP_ANONYMOUS, 0, 0);
for ( i = 0; i < bufsize; i+= 256)
    buff[i] = 0x5;

gets 20% faster (touches every cache line of a page)

and

buff = mmap(NULL, bufsize, PROT_EXEC|PROT_WRITE|PROT_READ, MAP_PRIVATE|MAP_ANONYMOUS, 0, 0);
for ( i = 0; i < bufsize; i+= 4096)
    buff[i] = 0x5;

is within the noise (touches one cache line of a page).
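
For reference, the two loops above could be wrapped into a small
stand-alone helper like the sketch below. This is not the original test
program: touch_buffer(), its parameters and the timing via
clock_gettime() are assumptions made for illustration, PROT_EXEC is
dropped because the loop does not need it, and fd is passed as -1.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* for MAP_ANONYMOUS and clock_gettime() */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>

/*
 * Hypothetical helper, not part of the patch: map bufsize bytes of
 * anonymous memory and write one byte every `stride` bytes, so that the
 * first write to each page faults in a freshly cleared page.
 *
 * Returns the number of writes, or 0 if the mapping failed. If
 * elapsed_ns is non-NULL it receives the time spent in the loop.
 */
static size_t touch_buffer(size_t bufsize, size_t stride, long long *elapsed_ns)
{
    struct timespec start, end;
    volatile char *buff;
    size_t i, writes = 0;
    void *map;

    assert(stride > 0);

    map = mmap(NULL, bufsize, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED)
        return 0;
    buff = map;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < bufsize; i += stride) {
        buff[i] = 0x5;
        writes++;
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    if (elapsed_ns)
        *elapsed_ns = (long long)(end.tv_sec - start.tv_sec) * 1000000000LL
                      + (end.tv_nsec - start.tv_nsec);

    munmap(map, bufsize);
    return writes;
}
```

A stride of 256 corresponds to the first test and a stride of 4096 to
the second. bufsize should be large enough that the page faults, and
with them the page clearing, dominate the measured time.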

As clear_page is usually called on the first access to a page,
we can assume that at least one cache line is used afterwards,
so this change should always be better.
Another benchmark, a make -j 40 of my testsuite in tmpfs with
hot caches on a 32-CPU system:

 -- unpatched --       --  patched  --
real     0m1.017s     real     0m0.994s   (~2% faster, but in noise)
user     0m5.339s     user     0m5.016s   (~6% faster)
sys      0m0.691s     sys      0m0.632s   (~8% faster)

Let's use the same define to memset as the asm-generic variant.
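
As a user-space sketch of that idea (not the kernel header itself):
clear_page becomes a plain memset over one page, which leaves the
choice of instruction sequence to the compiler. The 4 KB PAGE_SIZE
fallback and the page_is_clear() helper are assumptions for this
example only.

```c
#include <stddef.h>
#include <string.h>

#ifndef PAGE_SIZE
#define PAGE_SIZE 4096UL /* assumption: 4 KB pages, as on s390 */
#endif

/* Page clearing expressed as a memset of exactly one page. */
#define clear_page(page) memset((page), 0, PAGE_SIZE)

/* Hypothetical helper for this sketch: returns 1 if every byte is zero. */
static int page_is_clear(const unsigned char *page)
{
    size_t i;

    for (i = 0; i < PAGE_SIZE; i++)
        if (page[i] != 0)
            return 0;
    return 1;
}
```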

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-02-26 09:24:49 +01:00
..
appldata s390: appldata: drop owner assignment from platform_drivers 2014-10-20 16:20:13 +02:00
boot s390/sclp: fix declaration of _sclp_print_early() 2015-01-08 10:02:51 +01:00
configs s390: update default configuration 2015-01-22 12:16:09 +01:00
crypto s390/crypto: remove 'const' to avoid compiler warnings 2015-01-08 10:02:53 +01:00
hypfs VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry) 2015-02-22 11:38:41 -05:00
include s390: let the compiler do page clearing 2015-02-26 09:24:49 +01:00
kernel s390/jump label: improve and fix sanity check 2015-02-26 09:24:46 +01:00
kvm Fairly small update, but there are some interesting new features. 2015-02-13 09:55:09 -08:00
lib s390/spinlock: add compare-and-delay to lock wait loops 2015-01-23 15:17:04 +01:00
math-emu s390: fix save and restore of the floating-point-control register 2013-10-24 17:17:11 +02:00
mm s390/mm: align 64-bit PIE binaries to 4GB 2015-02-19 10:36:32 +01:00
net s390/bpf: Zero extend parameters before calling C function 2015-01-15 11:10:41 +01:00
oprofile s390: Replace __get_cpu_var uses 2014-08-26 13:45:52 -04:00
pci s390/pci: fix possible information leak in mmio syscall 2015-02-26 09:24:48 +01:00
defconfig s390: update default configuration 2015-01-22 12:16:09 +01:00
Kbuild
Kconfig s390/smp: increase maximum value of NR_CPUS to 512 2015-01-30 09:31:13 +01:00
Kconfig.debug Kconfig: consolidate CONFIG_DEBUG_STRICT_USER_COPY_CHECKS 2013-04-30 17:04:09 -07:00
Makefile s390/ftrace: hotpatch support for function tracing 2015-01-29 09:19:25 +01:00