[LAD] vectorization

Jussi Laako jussi at sonarnerd.net
Mon May 5 18:55:28 UTC 2008


Jens M Andreasen wrote:
> Could you try this out with your proposed compiler options on your own
> hardware?

I also added my own hand-written asm variant to the comparison.

With "-O3 -msse3 -ffast-math -ftree-vectorize -fprefetch-loop-arrays" on gcc:
 > clock: 15740 ms (_Complex)
 > clock: 24930 ms (cvec_t)
 > clock: 17770 ms (original float array[N][2])
 > clock: 13660 ms (asm on float array)

With "-O3 -xO -fp-model fast" on icc (all variants vectorized):
 > clock: 1030 ms (_Complex)
 > clock: 520 ms (cvec_t)
 > clock: 16250 ms (original float array[N][2])
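
For anyone reading without the test program at hand, here is a rough sketch of the kind of loops being compared. The benchmark source is not part of this message, so the kernel (an element-wise complex multiply) and the names cmul_complex, cmul_pairs and elapsed_ms are assumptions made only for illustration. cvec_t is left out because its definition is not given here. The timing helper assumes the "clock: N ms" figures come from clock().

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>
#include <time.h>

/* Assumed stand-in kernel: out[i] = a[i] * b[i] for complex values.
   This is NOT the benchmark from the thread, only a sketch of the
   two data layouts named in the results. */

/* Layout "_Complex": C99 complex type, one object per sample. */
void cmul_complex(float _Complex *out, const float _Complex *a,
                  const float _Complex *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] * b[i];
}

/* Layout "float array[N][2]": [i][0] is the real part, [i][1] the
   imaginary part, multiplied out by hand. */
void cmul_pairs(float (*out)[2], float (*a)[2], float (*b)[2], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float re = a[i][0] * b[i][0] - a[i][1] * b[i][1];
        float im = a[i][0] * b[i][1] + a[i][1] * b[i][0];
        out[i][0] = re;
        out[i][1] = im;
    }
}

/* Convert two clock() readings into elapsed milliseconds of CPU time. */
double elapsed_ms(clock_t start, clock_t end)
{
    return (double)(end - start) * 1000.0 / CLOCKS_PER_SEC;
}
```

To time one variant, take clock() before and after a loop that calls it many times and pass both readings to elapsed_ms(). Whether a given loop was actually vectorized has to be checked in the compiler's own vectorization report. The source alone does not tell you.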


> (although it seems to shine when paired up with icc :)

icc is very nice... :)


	- Jussi


