Steve Harris <S.W.Harris(a)ecs.soton.ac.uk>, on Mon Feb 17, 2003 [09:20:26 AM] said:
On Mon, Feb 17, 2003 at 08:34:10 +0100, Roger Larsson wrote:
Testing that I have done suggests that you should ALWAYS define the
architecture of your target. (I have not checked whether the example does this,
but it is usually forgotten...)
-march=pentium3
or if you need it to run on older computers
-mcpu=pentium3
(see man gcc, search for "Intel 386")
And at least use optimization level -O1 (use -O3 to get automatic inlining)
Agreed. I generally use -O6. It's not a good idea for general use, but for
small fast plugin code it's often the fastest. My full set of flags is:
-O6 -fomit-frame-pointer -fstrength-reduce -funroll-loops
-fmove-all-movables -ffast-math ${MACHINE_SPECIFIC}
I'd be interested to see how this compares with other people's. I've been
meaning to write a test script that tries all the combinations to find
which produces the fastest code.
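
A test script of the kind described here could start by enumerating the flag combinations. The following is only a minimal sketch in Python, not anything posted in the thread: the source file name `plugin_bench.c`, the output names, and the helper functions are hypothetical, and `-march=pentium3` merely stands in for MACHINE_SPECIFIC. It prints the gcc command lines rather than running them, so a timing step would still have to be added.

```python
import itertools

# Flags copied from the message above. Which of them a given GCC accepts
# depends on the compiler version, so treat the list as an example only.
BASE_FLAGS = ["-O6"]
OPTIONAL_FLAGS = [
    "-fomit-frame-pointer",
    "-fstrength-reduce",
    "-funroll-loops",
    "-fmove-all-movables",
    "-ffast-math",
]


def flag_combinations(optional):
    """Yield every subset of `optional`, from the empty set to the full set."""
    for n in range(len(optional) + 1):
        for subset in itertools.combinations(optional, n):
            yield list(subset)


def build_command(source, output, flags, machine_specific=()):
    """Return the gcc argument list for one flag combination."""
    return ["gcc", *BASE_FLAGS, *flags, *machine_specific, "-o", output, source]


if __name__ == "__main__":
    # Hypothetical file names; nothing is compiled here, the commands are
    # only printed so they can be inspected or fed to a shell.
    for i, flags in enumerate(flag_combinations(OPTIONAL_FLAGS)):
        cmd = build_command("plugin_bench.c", f"bench_{i}", flags, ["-march=pentium3"])
        print(" ".join(cmd))
```

With five optional flags this yields 32 command lines. The count doubles with each extra flag, so an exhaustive search is only practical for a short list.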
The -march, -mcpu flags are the most important (they go in
MACHINE_SPECIFIC), but -ffast-math can help a lot too.
- Steve
Hi;
Here is an article and some discussion of these things:
http://freshmeat.net/articles/view/730/
Unfortunately, it is hard to make a hard and fast rule
as to what the optimal compilation switches are; you can't be
sure until you actually test and compare on a specific piece
of software (or maybe a class of software, like a 'small fast plugin').
E.g. -O3, loop unrolling, and alignment for architecture can
bloat up code and _might_ result in critical stuff not fitting
in cache. (I've heard claims about stuff that ran faster with
-Os than -O3.) Perhaps a set of switches appropriate to the
concerns of audio developers could be generalized.
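
To make "test and compare" concrete, here is a rough Python sketch of a timing harness. It is an illustration under stated assumptions, not a method from the article or the thread: the helper names `best_time` and `rank` are made up, and the two candidate commands are placeholders that just start the Python interpreter. In real use each entry would be a benchmark binary built from the same source with a different set of switches (say, one with -Os and one with -O3).

```python
import subprocess
import sys
import time


def best_time(cmd, repeats=5):
    """Run `cmd` `repeats` times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def rank(timings):
    """Sort a {label: seconds} mapping from fastest to slowest."""
    return sorted(timings.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Placeholder commands (assumption): replace them with the paths of
    # benchmark binaries compiled with the switches under comparison.
    candidates = {
        "build-A": [sys.executable, "-c", "pass"],
        "build-B": [sys.executable, "-c", "sum(range(100000))"],
    }
    timings = {label: best_time(cmd) for label, cmd in candidates.items()}
    for label, seconds in rank(timings):
        print(f"{label}: {seconds:.4f} s")
```

Taking the minimum over several runs is one common way to reduce noise from other processes. Results still only hold for the machine and workload that were measured, which is the point made above.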
I'm barely even a novice in this area, though. :)
Paul
set(a)pobox.com