Jussi Laako wrote:
> N=1024, n=1000000, gcc:
> clock: 16500 ms (_Complex)
> clock: 26760 ms (cvec_t)
> clock: 15820 ms (original float array[N][2])
> clock: 13700 ms (asm on float array)
And the above case without vectorization (except the asm version):
clock: 19730 ms (_Complex)
clock: 66840 ms (cvec_t)
clock: 18260 ms (original float array[N][2])
clock: 13680 ms (asm on float array)
I would say there's a "slight" variation for the cvec_t case...
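
For anyone who wants to reproduce timings of this kind, here is a minimal sketch of how two of the layouts could be compared. It is not the code behind the figures above. The operation (element-wise complex multiply), the fill pattern, and every function name are assumptions made for illustration. cvec_t and the asm version are left out because their definitions are not shown here.

```c
#include <complex.h>
#include <stddef.h>
#include <time.h>

#define N 1024  /* vector length, matching N in the figures above */

/* Assumed test data: the same values in both layouts. */
static float _Complex ca[N], cb[N], cd[N];   /* C99 _Complex layout   */
static float fa[N][2], fb[N][2], fd[N][2];   /* float array[N][2]     */

/* Fill both layouts with identical small integer values. */
static void fill_inputs(void)
{
    for (size_t i = 0; i < N; i++) {
        float ar = (float)(i % 7), ai = (float)(i % 5);
        float br = (float)(i % 3), bi = (float)(i % 4);
        ca[i] = ar + ai * I;
        cb[i] = br + bi * I;
        fa[i][0] = ar; fa[i][1] = ai;
        fb[i][0] = br; fb[i][1] = bi;
    }
}

/* Hypothetical kernel: element-wise multiply on _Complex float. */
static void mul_complex(float _Complex *dst, const float _Complex *a,
                        const float _Complex *b, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = a[i] * b[i];
}

/* Same operation on the interleaved layout ([0] = re, [1] = im). */
static void mul_array(float (*dst)[2], float (*a)[2],
                      float (*b)[2], size_t len)
{
    for (size_t i = 0; i < len; i++) {
        float re = a[i][0] * b[i][0] - a[i][1] * b[i][1];
        float im = a[i][0] * b[i][1] + a[i][1] * b[i][0];
        dst[i][0] = re;
        dst[i][1] = im;
    }
}

/* Convert a clock() interval to milliseconds. */
static long elapsed_ms(clock_t start, clock_t end)
{
    return (long)((end - start) * 1000.0 / CLOCKS_PER_SEC);
}

/* Time `iters` passes of the _Complex kernel. An optimizing compiler
   may hoist or drop repeated identical passes, so treat the result
   with care. */
static long time_complex_ms(long iters)
{
    clock_t start = clock();
    for (long k = 0; k < iters; k++)
        mul_complex(cd, ca, cb, N);
    return elapsed_ms(start, clock());
}

/* Same timing loop for the float array[N][2] kernel. */
static long time_array_ms(long iters)
{
    clock_t start = clock();
    for (long k = 0; k < iters; k++)
        mul_array(fd, fa, fb, N);
    return elapsed_ms(start, clock());
}
```

A main() would call fill_inputs() once and then print each time_*_ms(n) result in the "clock: ... ms" form. With gcc, the loop vectorizer can be switched with -ftree-vectorize and -fno-tree-vectorize, which is one way to get a with/without comparison like the one above.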
- Jussi