[linux-audio-dev] [i686] xmm regs + gcc inline assembly

Tim Goetze tim at quitte.de
Fri Feb 13 00:38:25 UTC 2004


Simon Jenkins wrote:

>I can definitely get
>
>    asm ("movaps %%xmm1, %0" : "=m" (t[0]));
>
>to exhibit the optimisation problem (the one I couldn't get your
>original line to show) and then fix it again by removing the [0].
>
>I was getting a segfault on about 50% of compiles, as I modified
>the code, because the array was being aligned to 8 byte boundaries
>but not to 16 bytes. Declaring it as
>
>float t[4] __attribute__ ((aligned(16)));
>
>got rid of those. Note though that this attribute doesn't work for
>automatic variables.

ok, here is a distilled test of how i allocate the aligned buffer
and use the instructions:

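#include <stdio.h>
#include <stdint.h>

int main (int argc, char ** argv)
{
  char scratch [128 + 15];   /* 15 spare bytes leave room to align */
  float f = 2.3;

  /* s = distance from scratch to the next 16-byte boundary */
  int s = (int) (uintptr_t) scratch;
  s &= 0xF;
  if (s)
    s = 16 - s;
  float * d = (float *) (((char *) scratch) + s);
  fprintf (stderr, "%p\n", (void *) d);

  /* load f into the low lane of xmm0, copy it to all four lanes,
     then store the 16 bytes to d */
  asm ("movss %0, %%xmm0" : : "m" (f));
  asm ("shufps $0, %xmm0, %xmm0");
  asm ("movaps %%xmm0, %0" : "=m" (d[0]));

  printf ("%.2f %.2f %.2f %.2f\n", d[0], d[1], d[2], d[3]);
  return 0;
}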
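
the pointer rounding in the test above can also be written as a
small helper. this is only a sketch: align16 is a made-up name, not
something from the test itself, and it assumes the buffer was
over-allocated by at least 15 bytes, the way scratch is.

```c
#include <stdint.h>

/* align16 is a hypothetical helper: round p up to the next
   16-byte boundary. the caller must have over-allocated the
   buffer by at least 15 bytes so the result stays inside it. */
static float * align16 (void * p)
{
  uintptr_t a = (uintptr_t) p;

  a = (a + 15) & ~(uintptr_t) 15;
  return (float *) a;
}
```

with that, the setup in main would shrink to
float * d = align16 (scratch);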

you'll agree that the test program should print "2.30 2.30 2.30 2.30".
it does if the output operand is "=m" (d[0]). with "=m" (d), it doesn't.

here's what the assembly block compiles to with "=m" (d):

#APP
  movss -148(%ebp), %xmm0
  shufps $0, %xmm0, %xmm0
  movaps %xmm0, -156(%ebp)
#NO_APP

and here's with "=m" (d[0]):

#APP
  movss -148(%ebp), %xmm0
  shufps $0, %xmm0, %xmm0
#NO_APP
  movl -156(%ebp),%eax
#APP
  movaps %xmm0, (%eax)
#NO_APP

so saying "=m" (d) makes the pointer variable d itself the memory
operand: xmm0 is written to &d, not to the memory d points to, as
was intended. if &d isn't 128-bit aligned, movaps segfaults.
even if it is, that's not where we wanted the numbers from xmm0
to go ...
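
one way to make the intent explicit is to hand gcc the whole
16-byte object as the memory operand, so it knows that all four
floats are written and not just the first one, and to keep the three
instructions in a single asm statement with xmm0 listed as
clobbered. a sketch only: splat4 and vec4 are names made up here,
the #else branch is a plain c fallback for builds without sse, and
dst is assumed to be 16-byte aligned.

```c
#include <stdint.h>

/* splat4 is a hypothetical helper: write f to dst[0..3].
   dst must be 16-byte aligned, because movaps faults otherwise. */
static void splat4 (float * dst, float f)
{
#if defined (__GNUC__) && defined (__SSE__)
  typedef float vec4 [4];

  __asm__ volatile (
      "movss  %1, %%xmm0\n\t"         /* f into the low lane      */
      "shufps $0, %%xmm0, %%xmm0\n\t" /* copy lane 0 to all lanes */
      "movaps %%xmm0, %0"             /* store 16 bytes to *dst   */
      : "=m" (*(vec4 *) dst)  /* the 16 bytes dst points to */
      : "m" (f)
      : "xmm0");
#else
  /* fallback when sse is not available */
  for (int i = 0; i < 4; i++)
    dst [i] = f;
#endif
}
```

called as splat4 (d, f) in place of the three asm lines, the test
should still print "2.30 2.30 2.30 2.30".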

tim


