[LAD] GCC Vector extensions

Gabriel Beddingfield gabrbedd at gmail.com
Mon Jul 25 15:17:15 UTC 2011


On Mon, Jul 25, 2011 at 5:04 AM, Maurizio De Cecco <jmax at dececco.name> wrote:
> Short resume of my initial post: i found that using the gcc vector
> extensions induced a 2x slow down using gcc, and a 4x speed up in clang.
[snip]
>
> I include the code, results and scripts to run the tests in a small zip
> if anybody want to make other tests; the test code compute an arbitrary
> vector computation (essentially 100 million multiply add), starting from a
> seed given as argument.

I'm getting SIMD instructions when I compile.  However, you have two
things slowing you down:

  - The bookkeeping for the for(;;) loop (increment, compare, branch)
is paid on every iteration.
  - You're only using one xmm register, so each iteration stalls on
its own loads and stores.

Both of these can be solved by having gcc unroll your loops for you
(recompile with -funroll-loops).
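
To make the idea concrete, here is a rough hand-unrolled sketch of
what the unrolled loop amounts to.  It is not your test code: the
v4sf typedef and the mul_unrolled() name are made up for
illustration, and I'm assuming the buffers are arrays of 4-float
vectors built with the gcc vector extension.  Each pass does two
independent multiplies, so the loop overhead is paid half as often
and the compiler is free to use two xmm registers.

```c
#include <stddef.h>

/* Four packed floats, using the gcc/clang vector extension. */
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical helper: c[k] = a[k] * b[k] for n vectors, unrolled by two. */
void mul_unrolled(v4sf *c, const v4sf *a, const v4sf *b, size_t n)
{
    size_t k = 0;

    /* Two independent products per pass: half the counter updates,
       and nothing forces both products through the same register. */
    for (; k + 1 < n; k += 2) {
        v4sf p0 = a[k]     * b[k];
        v4sf p1 = a[k + 1] * b[k + 1];
        c[k]     = p0;
        c[k + 1] = p1;
    }

    /* Leftover vector when n is odd. */
    if (k < n)
        c[k] = a[k] * b[k];
}
```

In practice I'd leave the loop in its plain form and let the
compiler do this (e.g. gcc -O2 -funroll-loops), then check the
generated assembly to see what you actually got.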

In addition, you're handling 3 buffers at a time:  bufc[k] = bufa[k] *
bufb[k].  You might be able to speed it up a little by converting the
code to:

    memcpy(bufc, bufa, N*sizeof(float));
    for(k=0; k<N ; ++k) bufc[k] *= bufb[k];

This way you are only handling 2 buffers at a time (which an x86 CPU
generally does better with).  But YMMV on this piece of advice.
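
Since this one really is a "measure it" suggestion, here is a
minimal sketch of the two variants as plain-float functions you
could time against each other.  The function names are invented for
the example, and I'm assuming scalar float buffers that don't
overlap; neither is taken from your code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical baseline: reads a and b, writes c (three buffers per pass). */
void mul_three_buffers(float *c, const float *a, const float *b, size_t n)
{
    for (size_t k = 0; k < n; ++k)
        c[k] = a[k] * b[k];
}

/* Hypothetical variant: copy a into c first, then multiply in place,
   so each pass only touches two buffers. */
void mul_two_buffers(float *c, const float *a, const float *b, size_t n)
{
    memcpy(c, a, n * sizeof(float));
    for (size_t k = 0; k < n; ++k)
        c[k] *= b[k];
}
```

Both should produce identical output, so the only question is which
one runs faster on your machine.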

-gabriel


