[LAD] vectorization

Christian Schoenebeck cuse at users.sourceforge.net
Wed Apr 16 07:19:19 UTC 2008

On Wednesday, 16 April 2008 02:10:20, Jens M Andreasen wrote:
> On Tue, 2008-04-15 at 19:45 +0200, Christian Schoenebeck wrote:
> > Yeah, I'm respawning this topic ...
> There is something funny with this benchmark. If we compare your
>   Benchmarking mixdown (WITH coeff):
>   ASM SSE                 : 160 ms <-- faster?
> .. or leave in  C++ as well:
>   Benchmarking mixdown (WITH coeff):
>   pure C++                : 400 ms <-- slower?
>   ASM SSE                 : 170 ms
> .. or take out only the ASM:
> Benchmarking mixdown (WITH coeff):
> pure C++                : 380 ms <-- faster?
> GCC vector extensions   : 160 ms <-- slower?

Yeah, it even gets funnier: you may have noticed that ALLOC_BUFFERS macro. The 
timing results vary depending on whether you allocate the buffers at runtime 
(memalign()) or use statically allocated buffers at compile time. It could be 
the same over-optimization issue you already noticed. I haven't investigated 
it yet.
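
For readers without the benchmark source at hand, here is a minimal sketch of
what such a switch could look like. Everything except the ALLOC_BUFFERS name
is an assumption made for illustration (buffer length, helper names), and it
uses posix_memalign() in place of memalign() for portability; it is not the
code from the benchmark itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>   // posix_memalign(), free()

// 1 = allocate the buffer at run time, 0 = use a statically allocated one.
#ifndef ALLOC_BUFFERS
#define ALLOC_BUFFERS 1
#endif

constexpr std::size_t kFrames    = 1024;  // assumed buffer length
constexpr std::size_t kAlignment = 16;    // aligned SSE loads need 16 bytes

#if !ALLOC_BUFFERS
// Static variant: size and address are known to the compiler and linker.
alignas(16) static float g_staticBuffer[kFrames];
#endif

// True if p sits on a kAlignment-byte boundary.
bool isAligned(const float* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kAlignment == 0;
}

// Returns a 16-byte aligned buffer of kFrames floats, or nullptr on failure.
float* acquireBuffer() {
#if ALLOC_BUFFERS
    void* p = nullptr;
    if (posix_memalign(&p, kAlignment, kFrames * sizeof(float)) != 0)
        return nullptr;
    assert(isAligned(static_cast<float*>(p)));
    return static_cast<float*>(p);
#else
    return g_staticBuffer;
#endif
}

// Frees the buffer in the run-time variant; a no-op for the static one.
void releaseBuffer(float* p) {
#if ALLOC_BUFFERS
    free(p);
#else
    (void)p;
#endif
}
```

Building once with -DALLOC_BUFFERS=0 and once with -DALLOC_BUFFERS=1 gives the
two variants to compare.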

> Me thinks it is very difficult to predict what -O3 will or will not do.

Yep, but as you already pointed out, the speed relationship between those 
three solutions is clear, no matter what the absolute timing results are. I 
also flipped the order of the benchmarks, and the rough speed relationship 
was always the same.
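
A minimal sketch of how such an order flip can be done, assuming each
benchmark is wrapped in a callable. The names (Benchmark, Result, runAll) are
made up for illustration and are not taken from the actual benchmark source.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// One timing result: benchmark name and elapsed wall-clock milliseconds.
struct Result {
    std::string name;
    double ms;
};

// A named benchmark body.
using Benchmark = std::pair<std::string, std::function<void()>>;

// Runs every benchmark once, front to back, or back to front if `reversed`
// is set, and returns the timings in the order they were executed.
std::vector<Result> runAll(const std::vector<Benchmark>& benches, bool reversed) {
    std::vector<Result> results;
    const std::size_t n = benches.size();
    for (std::size_t i = 0; i < n; ++i) {
        const Benchmark& b = benches[reversed ? n - 1 - i : i];
        const auto t0 = std::chrono::steady_clock::now();
        b.second();
        const auto t1 = std::chrono::steady_clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(t1 - t0).count();
        results.push_back({b.first, ms});
    }
    return results;
}
```

If the ranking of the variants is the same for both orders, cache warm-up or
similar ordering effects are unlikely to explain the difference.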

But if you're totally sceptical, you could simply move the mixing 
functions into a separate C++ file, compile that object file with maximum 
optimization, and compile the actual benchmark application with just "-O1" or 
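
A minimal sketch of the separate-file idea: the mixing routine lives in its
own translation unit, so the compiler cannot inline it into the benchmark
loop or optimize across the call. The file names, the function name and the
exact mixing formula are assumptions made for illustration, and -O3 stands in
for "maximum optimization".

```cpp
// mixdown.cpp (hypothetical file name), built on its own, for example:
//   g++ -O3 -c mixdown.cpp
// The benchmark driver (hypothetical bench.cpp) is built with a lower
// optimization level and linked against the object file:
//   g++ -O1 -c bench.cpp
//   g++ bench.o mixdown.o -o bench
#include <cstddef>

// Scalar reference mix: adds `src`, scaled by `coeff`, onto `dst`.
void mixdownWithCoeff(float* dst, const float* src, float coeff,
                      std::size_t frames) {
    for (std::size_t i = 0; i < frames; ++i)
        dst[i] += src[i] * coeff;
}
```

The driver then only needs a declaration of mixdownWithCoeff() and cannot see
its body at compile time.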

