Re: [linux-audio-dev] [i686] xmm regs + gcc inline assembly

13 Feb 2004

Tim Goetze wrote:
...
 Simon Jenkins wrote:
 I can definitely get
   asm ("movaps %%xmm1, %0" : "=m" (t[0]));
to exhibit the optimisation problem (the one I couldn't get your
original line to show) and then fix it again by removing the [0].
[snip]

ok, here is a distilled test of how i allocate and use the
instructions:
#include <stdio.h>

int main (int argc, char ** argv)
{
  char scratch [128 + 15];
  float f = 2.3;
  int s = (int) scratch;
  s &= 0xF;
  if (s)
    s = 16 - s;
  float * d = (float *) (((char *) scratch) + s);
  fprintf (stderr, "%p\n", d);
  asm ("movss %0, %%xmm0" : : "m" (f));
  asm ("shufps $0, %xmm0, %xmm0");
  asm ("movaps %%xmm0, %0" : "=m" (d[0]));
  printf ("%.2f %.2f %.2f %.2f\n", d[0], d[1], d[2], d[3]);
}
you'll agree that the program should print "2.30 2.30 2.30 2.30".
it does if you use "=m" (d[0]). if you say "=m" (d), it doesn't.
here's what the assembly block compiles to with "=m" (d):
#APP
  movss -148(%ebp), %xmm0
  shufps $0, %xmm0, %xmm0
  movaps %xmm0, -156(%ebp)
#NO_APP
and here's with "=m" (d[0]):
#APP
  movss -148(%ebp), %xmm0
  shufps $0, %xmm0, %xmm0
#NO_APP
  movl -156(%ebp),%eax
#APP
  movaps %xmm0, (%eax)
#NO_APP
so saying "=m" (d) causes xmm0 to be written to &d, not d, as
intended. if &d isn't 128-bit aligned, it will segfault now.
even if it is, that's not where we wanted the numbers from xmm0
to go ...
 The discrepancy here is because you originally said you were trying to
get the data into a named array of floats:
    float t[4];
but it turns out you're actually trying to get them into some memory
to which you have a named pointer:
  float *d;
Now, there are a great many circumstances in which you could treat
such names interchangeably, but this isn't one of them.
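To make the distinction concrete, here is a small sketch (the helper
names are made up for illustration, they are not from the thread). As
an object, an array name stands for all of its elements, while a
pointer name stands only for the pointer variable. That is why
"=m" (d) with float *d hands the asm the address of the pointer
itself rather than the memory it points to:

```c
#include <stddef.h>

/* Hypothetical helper: size of the object that an array name designates. */
size_t bytes_named_by_array (void)
{
  float t[4];
  return sizeof t;   /* all four floats, i.e. 4 * sizeof (float) */
}

/* Hypothetical helper: size of the object that a pointer name designates. */
size_t bytes_named_by_pointer (void)
{
  float *d = 0;
  return sizeof d;   /* just the pointer variable, not its target */
}
```

A memory operand follows the same rule as sizeof here: it refers to
the object the expression names, not to whatever that object happens
to point at.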
The following code demonstrates
  asm ("movaps %%xmm0, %0" : "=m" (d));
working correctly if d is an aligned array of floats. Also, if
you change the d to d[0], it exhibits the optimization problem.
/* start */
#include <stdio.h>
float d[4] __attribute__ ((aligned(16))) = { 1.1f, 1.1f, 1.1f, 1.1f };
int main (int argc, char ** argv)
{
  float z = 1.1f;
  float f = 2.3f;
  z += 3.3f;
  asm ("movss %0, %%xmm0" : : "m" (f));
  asm ("shufps $0, %xmm0, %xmm0");
  asm ("movaps %%xmm0, %0" : "=m" (d));
  z += d[1];
  printf ("%.2f %.2f %.2f %.2f\n", d[0], d[1], d[2], d[3]);
  printf ("z is %.2f\n", z );
}
/* end */
We're expecting (and we get):
2.30 2.30 2.30 2.30
z is 6.70
but using d[0] instead of d we end up getting:
2.30 2.30 2.30 2.30
z is 5.50
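For the pointer case, one approach that should avoid both problems is
to cast the pointer to a pointer-to-array and dereference it, so that
the "=m" operand covers all sixteen bytes rather than only d[0] or the
pointer variable. This is only a sketch under some assumptions:
splat4 is a hypothetical name, the three instructions are folded into
one asm statement with xmm0 listed as clobbered (my assumption about
safer practice, not something tested in this thread), and the plain C
loop is just a fallback for builds without SSE.

```c
/* Hypothetical helper: write `value` into the four floats at dst.
   dst must be 16-byte aligned, because movaps faults otherwise. */
void splat4 (float *dst, float value)
{
#if defined(__SSE__)
  /* The output operand is the whole float[4] object that dst points
     to, so the compiler knows all four elements are overwritten. */
  __asm__ ("movss %1, %%xmm0\n\t"
           "shufps $0, %%xmm0, %%xmm0\n\t"
           "movaps %%xmm0, %0"
           : "=m" (*(float (*)[4]) dst)
           : "m" (value)
           : "xmm0");
#else
  /* Fallback for targets without SSE: same result in plain C. */
  int i;
  for (i = 0; i < 4; i++)
    dst[i] = value;
#endif
}
```

Called as splat4 (d, f) with d being the aligned float pointer from
the first program, this should store to the memory d points at and
also keep the compiler from reusing a stale d[1].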
Simon Jenkins
(Bristol, UK)
