[linux-audio-dev] Traps in floating point code

Jussi Laako jussi.laako at pp.inet.fi
Fri Jul 2 19:14:04 UTC 2004


On Fri, 2004-07-02 at 00:40, Erik de Castro Lopo wrote:

> > Eric what do you think ? can something like that be coded efficiently 
> > using SSE/SSE2 ?
> 
> Probably not. There are some algorithms which simply can't be vectorized.

SSE2 is usually significantly faster for non-vectorized (scalar) code as
well, at least on P4 and AMD64. I usually profile the code generated by
the compiler and then hand-code in SSE2 the parts where the compiler's
output is the bottleneck.

An IIR filter was one good example where the compilers did badly.
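
To illustrate what "non-vectorized SSE2" means for an IIR filter, here is
a minimal sketch of one biquad section written with scalar SSE2
intrinsics. It is not the code referred to above. The names biquad_t and
biquad_process, the choice of double precision, and the direct form II
transposed structure are all assumptions made for the example. Each
sample depends on the previous output, so the loop stays scalar and only
the low lane of each XMM register is used.

```c
#include <stddef.h>
#ifdef __SSE2__
#include <emmintrin.h>
#endif

/* Hypothetical biquad section, direct form II transposed:
 *   y  = b0*x + z1
 *   z1 = b1*x - a1*y + z2
 *   z2 = b2*x - a2*y
 */
typedef struct {
    double b0, b1, b2;   /* feed-forward coefficients */
    double a1, a2;       /* feedback coefficients (a0 assumed to be 1) */
    double z1, z2;       /* filter state, kept between calls */
} biquad_t;

static void biquad_process(biquad_t *f, const double *in, double *out,
                           size_t n)
{
    size_t i;
#ifdef __SSE2__
    /* Scalar SSE2 path: every operation works on the low lane only. */
    const __m128d b0 = _mm_set_sd(f->b0);
    const __m128d b1 = _mm_set_sd(f->b1);
    const __m128d b2 = _mm_set_sd(f->b2);
    const __m128d a1 = _mm_set_sd(f->a1);
    const __m128d a2 = _mm_set_sd(f->a2);
    __m128d z1 = _mm_set_sd(f->z1);
    __m128d z2 = _mm_set_sd(f->z2);

    for (i = 0; i < n; i++) {
        __m128d x = _mm_load_sd(&in[i]);
        __m128d y = _mm_add_sd(_mm_mul_sd(b0, x), z1);
        z1 = _mm_add_sd(_mm_sub_sd(_mm_mul_sd(b1, x),
                                   _mm_mul_sd(a1, y)), z2);
        z2 = _mm_sub_sd(_mm_mul_sd(b2, x), _mm_mul_sd(a2, y));
        _mm_store_sd(&out[i], y);
    }
    f->z1 = _mm_cvtsd_f64(z1);
    f->z2 = _mm_cvtsd_f64(z2);
#else
    /* Plain C fallback with the same arithmetic, for non-SSE2 targets. */
    double z1 = f->z1, z2 = f->z2;

    for (i = 0; i < n; i++) {
        double x = in[i];
        double y = f->b0 * x + z1;
        z1 = f->b1 * x - f->a1 * y + z2;
        z2 = f->b2 * x - f->a2 * y;
        out[i] = y;
    }
    f->z1 = z1;
    f->z2 = z2;
#endif
}
```

A caller would fill in the coefficients, zero z1 and z2, and then call
biquad_process() once per block of samples. Whether this beats the
compiler's own output depends on the compiler and the CPU, so it is
worth profiling both versions before keeping the hand-written one.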


-- 
Jussi Laako <jussi.laako at pp.inet.fi>



