Alexey Tourbin
0e8ab4e05c
rpmrc.in: use -mtune=generic instead of -mtune=pentium4 for i[3456]86
We use i586 as our default generic arch for x86 processors. But -mtune=pentium4 is preferable only for Intel processors, and possibly disadvantageous for AMD chips. I suggest we use -mtune=generic instead.

Here is what "man gcc" says about -mtune=generic:

    Produce code optimized for the most common IA32/AMD64/EM64T processors. If you know the CPU on which your code will run, then you should use the corresponding -mtune option instead of -mtune=generic. But, if you do not know exactly what CPU users of your application will have, then you should use this option.

    As new processors are deployed in the marketplace, the behavior of this option will change. Therefore, if you upgrade to a newer version of GCC, the code generated option will change to reflect the processors that were most common when that version of GCC was released.

Now, if you're willing to take a look at gcc/gcc/config/i386/i386.c, you can see that the -mtune= option affects only "instruction costs". For example, AMD chips take fewer cycles to execute some divide/mod instructions than Intel processors. Instruction costs can steer optimization passes (the peephole optimizer, for example) toward instruction sequences that take fewer cycles. It appears that "generic32_cost" provides a reasonable compromise, so that the resulting code runs quite well on all modern CPUs.

Update. I've been asked to provide some numbers. I use the perlbench-0.93 suite to measure libperl.so performance.

A) libperl.so compiled with -march=i586 -mtune=pentium4
B) libperl.so compiled with -march=i586 -mtune=generic

AMD Athlon 64           A    B
--------------------  ---  ---
arith/mixed           100  106
arith/trig            100  100
array/copy            100  104
array/foreach         100   94
array/index           100  108
array/pop             100  109
array/shift           100  107
array/sort-num        100  103
array/sort            100  100
call/0arg             100  105
call/1arg             100   96
call/2arg             100  101
call/9arg             100  107
call/empty            100  108
call/fib              100  103
call/method           100  106
call/wantarray        100  107
hash/copy             100   99
hash/each             100   91
hash/foreach-sort     100   96
hash/foreach          100  100
hash/get              100  102
hash/set              100  110
loop/for-c            100  104
loop/for-range-const  100  102
loop/for-range        100  103
loop/getline          100  106
loop/while-my         100  109
loop/while            100  113
re/const              100  104
re/w                  100  102
startup/fewmod        100  104
startup/lotsofsub     100  107
startup/noprog        100  100
string/base64         100  102
string/htmlparser     100  102
string/index-const    100  110
string/index-var      100   74
string/ipol           100  105
string/tr             100  102

AVERAGE               100  103

Intel Xeon              A    B
--------------------  ---  ---
arith/mixed           100   98
arith/trig            100  138
array/copy            100  101
array/foreach         100  100
array/index           100   94
array/pop             100   99
array/shift           100  117
array/sort-num        100  103
array/sort            100  105
call/0arg             100  101
call/1arg             100   97
call/2arg             100   93
call/9arg             100   98
call/empty            100  100
call/fib              100  116
call/method           100   92
call/wantarray        100  101
hash/copy             100  104
hash/each             100  102
hash/foreach-sort     100  102
hash/foreach          100   98
hash/get              100  102
hash/set              100   96
loop/for-c            100  128
loop/for-range-const  100  100
loop/for-range        100  103
loop/getline          100   94
loop/while-my         100  107
loop/while            100  102
re/const              100   99
re/w                  100   92
startup/fewmod        100  101
startup/lotsofsub     100   98
startup/noprog        100  101
string/base64         100  100
string/htmlparser     100   70
string/index-const    100  103
string/index-var      100  101
string/ipol           100  105
string/tr             100   94

AVERAGE               100  101

Look ma, I've got about 3% performance boost on Athlon64 and even some minor improvement on Intel Xeon! Also notice that, on Xeon, the numbers are more diverse.
I believe the numbers show that, compared to -mtune=pentium4, -mtune=generic is beneficial for Athlon64 and at least does no harm on Xeon.

Here is how to run perlbench:

  $ echo ${PWD##*/}
  perlbench-0.93
  $ cat perl1 perl2
  LD_LIBRARY_PATH=$PWD/lib1 exec /usr/bin/perl "$@"
  LD_LIBRARY_PATH=$PWD/lib2 exec /usr/bin/perl "$@"
  $ ls -l lib?/libperl*
  -rw-r--r-- 1 at at 1173944 Jan 9 03:42 lib1/libperl.so.5.8
  -rw-r--r-- 1 at at 1204984 Jan 9 03:46 lib2/libperl.so.5.8
  $ ./perlbench-run ./perl1 ./perl2
  ...
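
The AVERAGE rows and the remark that the Xeon numbers are more diverse can be checked with a short script. This is only a sketch: it assumes AVERAGE is a plain arithmetic mean rounded to the nearest integer (perlbench's own averaging may differ), and it uses the population standard deviation as one possible measure of spread. The names ATHLON64_B, XEON_B and summarize are made up for this illustration; the scores are column B (-mtune=generic) copied from the two tables, with column A fixed at 100.

```python
from statistics import mean, pstdev

# Column B (-mtune=generic) scores in table order; column A is always 100.
ATHLON64_B = [
    106, 100, 104, 94, 108, 109, 107, 103, 100, 105,
    96, 101, 107, 108, 103, 106, 107, 99, 91, 96,
    100, 102, 110, 104, 102, 103, 106, 109, 113, 104,
    102, 104, 107, 100, 102, 102, 110, 74, 105, 102,
]
XEON_B = [
    98, 138, 101, 100, 94, 99, 117, 103, 105, 101,
    97, 93, 98, 100, 116, 92, 101, 104, 102, 102,
    98, 102, 96, 128, 100, 103, 94, 107, 102, 99,
    92, 101, 98, 101, 100, 70, 103, 101, 105, 94,
]


def summarize(scores):
    """Return (rounded arithmetic mean, population standard deviation)."""
    return round(mean(scores)), pstdev(scores)


for name, scores in (("AMD Athlon 64", ATHLON64_B), ("Intel Xeon", XEON_B)):
    avg, spread = summarize(scores)
    print(f"{name}: mean={avg} stdev={spread:.1f} "
          f"min={min(scores)} max={max(scores)}")
```

Under these assumptions the rounded means agree with the AVERAGE rows (103 and 101), and the Xeon column has the larger standard deviation, which matches the observation about diversity.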
This is RPM, the Red Hat Package Manager.

The latest releases are always available at:

    ftp://ftp.rpm.org/pub/rpm

Additional RPM documentation (papers, slides, HOWTOs) can also be found at the same site, as well as http://www.rpm.org.

There is a mailing list for discussion of RPM issues, rpm-list@redhat.com. To subscribe, send a message to rpm-list-request@redhat.com with the word "subscribe" in the subject line.

RPM was originally written by:

    Erik Troan <ewt@redhat.com>
    Marc Ewing <marc@redhat.com>

See the CREDITS file for a list of folks who have helped us out tremendously.

RPM is Copyright (c) 1998 by Red Hat Software, Inc., and may be distributed under the terms of the GPL and LGPL (see the file COPYING for details).