On Wed, 2008-04-16 at 14:01 +0200, Christian Schoenebeck wrote:
On Wednesday, 16 April 2008 12:17:23, Jens M Andreasen wrote:
Then you will probably like this run of ./mixdown:
Benchmarking mixdown (WITH coeff):
pure C++ : 150 ms <-- pretty good, eh? ;-)
ASM SSE : 140 ms
GCC vector extensions : 150 ms
This is with -ftree-vectorize added to your compile options.
Hey, nice! :-)
But that only works well for the inlined version. If the mixer C(++) function
is located in another object file and the compiler is forced to use real
function calls, then the C(++) result is still worse than the gcc vector
version, even with "-ftree-vectorize":
It is not the function call. Seems more like the vectorizer gets lost?
You can get all kinds of information with -ftree-vectorizer-verbose=5
--8<-------------------------------------------------------
cppmix.cpp:27: note: dependence distance = 0.
cppmix.cpp:27: note: accesses have the same alignment.
cppmix.cpp:27: note: dependence distance modulo vf == 0 between
*D.2232_8 and *D.2232_8
cppmix.cpp:27: note: not vectorized: can't determine dependence between
*D.2235_18 and *D.2232_8
cppmix.cpp:26: note: vectorized 0 loops in function.
.. unfortunately, I have no idea exactly what to do with the
information.
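
A guess, not verified against the actual cppmix.cpp: "can't determine
dependence between" two pointer dereferences usually means the vectorizer
cannot prove that the destination and source buffers do not overlap. In
the inlined version the compiler may be able to see the real arrays, while
in a separate object file it only sees two plain pointers. A minimal
sketch of a mixer that makes the no-overlap promise explicit with GCC's
__restrict__ extension could look like this (function and parameter names
are hypothetical, not taken from the benchmark):

```cpp
#include <cstddef>

// Hypothetical mixer, NOT the code from cppmix.cpp.
// __restrict__ is a GCC extension: it promises the compiler that dst and
// src never overlap, so loads and stores in the loop may be reordered.
void mix_with_coeff(float* __restrict__ dst,
                    const float* __restrict__ src,
                    float coeff,
                    std::size_t n)
{
    // Scale each source sample by coeff and accumulate into dst.
    for (std::size_t i = 0; i < n; ++i)
        dst[i] += src[i] * coeff;
}
```

Whether this actually changes the -ftree-vectorizer-verbose output or the
timings would have to be measured. The promise is also only safe if the
callers really never pass overlapping buffers.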
Moving the cpp code back to the main file and adding
-funsafe-math-optimizations proves a lot more interesting:
Benchmarking mixdown (no coeff):
pure C++ : 100 ms
ASM SSE : 140 ms
GCC vector extensions : 120 ms
Benchmarking mixdown (WITH coeff):
pure C++ : 120 ms
ASM SSE : 150 ms
GCC vector extensions : 170 ms
One more time for those who missed it the first time, this time
commenting out everything but 'pure cpp w coeff':
Benchmarking mixdown (WITH coeff):
pure C++ : 120 ms <-- bloody murder!