Christian Schoenebeck wrote:
I compared a pure C++ implementation vs. the hand-crafted SSE assembly code
(by Sampo Savolainen, Ardour) and of course an implementation utilizing GCC's
vector extensions. On my very weak, but environment friendly ;-) VIA box the
For simple operations, compilers are rather good at vectorization. I don't
know, though, whether gcc has any support for multi-arch targets, so that
an SSE2/SSE3-optimized binary would also run on hardware without SSE
(dynamic code selection). I haven't had time to follow the latest gcc
developments.
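To make "dynamic code selection" concrete, here is a rough sketch of picking a kernel at run time. It assumes GCC (or Clang) on x86, where the `__builtin_cpu_supports` builtin and the `target` function attribute are available; on other compilers or architectures it simply falls back to the scalar path. The names `apply_gain`, `apply_gain_scalar`, `apply_gain_sse2` and `select_apply_gain` are made up for this example and do not come from any of the projects mentioned.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical kernel signature: scale n floats in place by gain.
using GainFn = void (*)(float* buf, std::size_t n, float gain);

// Portable fallback, built with the baseline flags, safe on any CPU.
static void apply_gain_scalar(float* buf, std::size_t n, float gain) {
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] *= gain;
    }
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
// Same loop, but this one function is compiled with SSE2 enabled,
// independent of the -m flags used for the rest of the file.
// Whether the loop actually gets vectorized is up to the compiler.
__attribute__((target("sse2")))
static void apply_gain_sse2(float* buf, std::size_t n, float gain) {
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] *= gain;
    }
}
#endif

// Ask the CPU what it supports and return the matching kernel.
GainFn select_apply_gain() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    if (__builtin_cpu_supports("sse2")) {
        return apply_gain_sse2;
    }
#endif
    return apply_gain_scalar;
}

// Public entry point: the selection runs once, on first use.
void apply_gain(float* buf, std::size_t n, float gain) {
    assert(buf != nullptr || n == 0);
    static const GainFn fn = select_apply_gain();
    fn(buf, n, gain);
}
```

Callers only ever see `apply_gain(buf, n, gain)`; the same binary then runs on hardware with or without SSE2, at the cost of one indirect call per invocation.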
For more complex operations like FIR, IIR, normalized cross-correlation
or complex multiply-accumulate, I haven't seen any compiler that is able
to match hand-crafted assembly code.
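As a point of reference, this is the kind of kernel meant by "FIR": a plain scalar direct-form filter, written here only as a sketch. The function name `fir_filter` and the use of `std::vector` are choices made for this example. The inner loop is a multiply-accumulate reduction over the taps that reads the input backwards.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Direct-form FIR: y[n] = sum over k of h[k] * x[n - k].
// Samples before the start of x are treated as zero.
std::vector<float> fir_filter(const std::vector<float>& x,
                              const std::vector<float>& h) {
    assert(!h.empty());
    std::vector<float> y(x.size(), 0.0f);
    for (std::size_t n = 0; n < x.size(); ++n) {
        float acc = 0.0f;
        // Stop at k == n so that x[n - k] never indexes before x[0].
        for (std::size_t k = 0; k < h.size() && k <= n; ++k) {
            acc += h[k] * x[n - k];
        }
        y[n] = acc;
    }
    return y;
}
```

With input {1, 2, 3, 4} and the two-tap averaging filter {0.5, 0.5}, this returns {0.5, 1.5, 2.5, 3.5}.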
BR,
- Jussi Laako