[LAD] vectorization

Christian Schoenebeck cuse at users.sourceforge.net
Thu Feb 7 21:57:06 UTC 2008


On Thursday, 7 February 2008 19:42:46, Jens M Andreasen wrote:
> It did change (or will do?) with the Penryn Core 2:
>
> The INSERTPS and PINSR instructions read 8, 16 or 32 bits from an x86
> register or memory location and insert it into a field in the destination
> register given by an immediate operand, EXTRACTPS and PEXTR read a field
> from the source register and insert it into an x86 register or memory
> location. For example, PEXTRD eax, [xmm0], 1; EXTRACTPS [addr+4*eax],
> xmm1, 1 stores the first field of xmm1 in the address given by the first
> field of xmm0.
>
> http://en.wikipedia.org/wiki/SSE4

I didn't mean the hardware restriction, i.e. the limited power of current
SSE instructions. What I meant was the (IMO) still incomplete implementation
of vector extensions in gcc.

I would expect to be able to access the cells of a vector in C(++) with plain
high-level expressions, and gcc to compile them automatically to e.g.
"workaround" SHUFPS instruction sequences. But there is no such high-level
access support in gcc yet (or at least there was none). As soon as you want
to work at element / cell level (e.g. rotating the vector cells), you have
to go low level, and in that case there's not much sense in using gcc's
vector extensions at all.
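
To illustrate the point, here is a minimal sketch, assuming GNU C and a
16-byte vector of four floats. Whole-vector arithmetic (e.g. a + b) works on
the vector type directly, but for per-cell access the sketch falls back on a
union, relying on the union type punning that GNU C documents. The names
v4sf_cells, v4sf_get and v4sf_rotl are made up for this example; whether gcc
turns the rotation into a SHUFPS sequence or goes through memory is not
guaranteed.

```c
/* GNU C vector type: four packed floats, 16 bytes (one SSE register). */
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical helper: view the same 16 bytes as a vector or as cells. */
typedef union {
    v4sf  v;
    float f[4];
} v4sf_cells;

/* Read one cell by going through the union (index wraps modulo 4). */
static float v4sf_get(v4sf v, int i)
{
    v4sf_cells c;
    c.v = v;
    return c.f[i & 3];
}

/* Rotate the cells left by n: result cell i takes source cell (i + n) mod 4. */
static v4sf v4sf_rotl(v4sf v, int n)
{
    v4sf_cells in, out;
    int i;
    in.v = v;
    for (i = 0; i < 4; ++i)
        out.f[i] = in.f[(i + n) & 3];
    return out.v;
}
```

With v4sf a = {1.0f, 2.0f, 3.0f, 4.0f}, the expression a + a needs no helper
at all, while v4sf_rotl(a, 1) is the kind of cell-level operation that has
to be spelled out by hand.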

CU
Christian


