[linux-audio-dev] Intel C Compiler & RedHat 8.0 , Pentium 4 FPU performance

Steve Harris S.W.Harris at ecs.soton.ac.uk
Tue Nov 12 12:20:01 UTC 2002


On Tue, Nov 12, 2002 at 08:23:50 -0800, Bob Colwell wrote:
> >Yes, you have to specify the use of sse explicitly (I think I mentioned it
> >on IRC when we were benchmarking). It appeared to make zero difference on
> >the athlon, but I didn't check the assembler to see exactly what it was
> >doing. I've heard that just using sse instructions instead of 387 on the
> >P4 is quicker, but I've not tried it. Gcc will do that if you specify -msse
> 
> The sse instructions ought to be substantially faster. There are many more
> registers available to support the flops, and they aren't organized into the
> ridiculous 387 stack, so they're easier to reach. I believe they also default
> to round-to-nearest and flush-denormals, but if you care about such niceties
> you should check.

Yes, that is correct AFAIK. However, the particular benchmark we are
talking about has a lot of memory access in it, and in that case it didn't
make any difference (or gcc was doing something stupid, I didn't check).

I don't understand processor issues well enough to know what the
bottleneck would be, but it doesn't appear to be the maths instructions.
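
One way to separate the two cases is to time a loop that mostly reads memory
against one that keeps everything in registers. The sketch below is not the
benchmark we ran; both kernels and their names are made up for illustration,
and how far the first one is limited by memory depends on the buffer size
relative to the cache.

```c
#include <stddef.h>

/* Hypothetical memory-heavy kernel: one add per element read. With a
 * buffer much larger than the cache, the loads are likely to dominate.
 * A stride of 0 would never advance, so it is rejected. */
float sum_strided(const float *buf, size_t n, size_t stride)
{
    float total = 0.0f;
    if (stride == 0)
        return total;
    for (size_t i = 0; i < n; i += stride)
        total += buf[i];
    return total;
}

/* Hypothetical arithmetic-heavy kernel: a multiply and an add per
 * iteration with no array access, so memory should not be the limit.
 * The value converges towards 2.0. */
float iterate_in_registers(size_t iters)
{
    float acc = 0.0f;
    for (size_t i = 0; i < iters; i++)
        acc = acc * 0.5f + 1.0f;
    return acc;
}
```

Timing each kernel (with clock() from <time.h>, say) in a 387 build and
again in a build using gcc's -mfpmath=sse option alongside -msse would show
which of the two, if either, is sensitive to the choice of FP unit.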

I will check the effect of sse on my plugins as they are generally less
RAM hungry, but I don't have a gcc3 machine around at the moment.

- Steve


