[LAD] GCC Vector extensions

Stéphane Letz letz at grame.fr
Mon Jul 25 13:33:04 UTC 2011


> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Mon, 25 Jul 2011 12:04:06 +0200
> From: Maurizio De Cecco <jmax at dececco.name>
> Subject: Re: [LAD] GCC Vector extensions
> Cc: linux-audio-dev at lists.linuxaudio.org
> Message-ID: <4E2D3F96.2010206 at dececco.name>
> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
> 
> Short summary of my initial post: I found that using the gcc vector
> extensions induced a 2x slowdown with gcc, and a 4x speedup with clang.
> 
> I made more tests, isolating a small code example, on Mac OS and Ubuntu,
> and I found the origin of the problem, even if I do not know what
> exactly is happening.
> 
> My original test used vectors of 8 floats; the gcc vector
> extension documentation says that if the vector size does not match the
> hardware vector size, the code is synthesized in some way.
> 
> With a vector size of 8 I found the above results under Mac OS X, using
> clang and gcc 4.2, and under Ubuntu 11.04, using clang and gcc 4.5.2.
> 
> When I move to a vector size of 4, things go better: clang slows down by
> around 2x with respect to the size of 8, and gcc obtains the same result; the
> interesting point is that gcc obtains essentially the same speed with
> and without vector extensions, probably meaning that the compiler is
> good enough at vectorizing the code, at least in the test cases I used.
> 
> I include the code, results and scripts to run the tests in a small zip
> if anybody wants to make other tests; the test code performs an arbitrary
> vector computation (essentially 100 million multiply-adds), starting from
> a seed given as argument.
> 
> The code is modeled on the way jmax computes, i.e. one vector
> operation at a time on vectors passed by pointers, and it is not
> designed to be the fastest possible code to implement this computation.
> 
> Thanks for the help,
> 
> Maurizio


You should really give the produced assembly code in each test case.
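
For instance, here is a minimal sketch of the kind of kernel being discussed,
not the actual jmax test code: the names v4f, v8f, madd_v4, madd_v8 and
madd_scalar are made up for illustration, and the multiply-add is only assumed
to resemble the real computation. It shows the 4-float and 8-float vector
types next to a plain scalar loop, so that the assembly produced for each can
be compared.

```c
#include <stddef.h>

/* GCC/clang vector extension types. v4f is 16 bytes (one SSE register);
   v8f is 32 bytes, wider than an SSE register, so on SSE-only targets the
   compiler has to synthesize each operation from narrower ones. */
typedef float v4f __attribute__((vector_size(16)));
typedef float v8f __attribute__((vector_size(32)));

/* Hypothetical kernel: out[i] = a[i] * b[i] + c[i], one vector operation
   at a time, with all operands passed by pointer. */
void madd_v4(v4f *out, const v4f *a, const v4f *b, const v4f *c, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = a[i] * b[i] + c[i];
}

/* Same kernel on 8-float vectors. */
void madd_v8(v8f *out, const v8f *a, const v8f *b, const v8f *c, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = a[i] * b[i] + c[i];
}

/* Plain scalar version, left entirely to the compiler's auto-vectorizer. */
void madd_scalar(float *out, const float *a, const float *b,
                 const float *c, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = a[i] * b[i] + c[i];
}
```

Compiling this with "gcc -O3 -S kernel.c" and "clang -O3 -S kernel.c" (the
file name is arbitrary) writes the generated assembly to kernel.s, where the
three loops can be compared side by side.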

Stephane


