Re: [linux-audio-dev] [i686] xmm regs + gcc inline assembly

12 Feb 2004

Simon Jenkins wrote:
...
 I can definitely get
    asm ("movaps %%xmm1, %0" : "=m" (t[0]));
to exhibit the optimisation problem (the one I couldn't get your
original line to show) and then fix it again by removing the [0].
I was getting a segfault on about 50% of compiles, as I modified
the code, because the array was being aligned to 8 byte boundaries
but not to 16 bytes. Declaring it as
    float t[4] __attribute__ ((aligned(16)));
got rid of those. Note though that this attribute doesn't work for
automatic variables. 

ok, here is a distilled test of how i allocate and use the
instructions:
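#include <stdio.h>

int main (int argc, char ** argv)
{
  char scratch [128 + 15];   /* 15 spare bytes leave room to round up */
  float f = 2.3;
  int s = (int) scratch;     /* address as an integer; fits on i686 */
  s &= 0xF;                  /* offset past the last 16-byte boundary */
  if (s)
    s = 16 - s;              /* bytes to skip to reach the next boundary */
  float * d = (float *) (((char *) scratch) + s);
  fprintf (stderr, "%p\n", (void *) d);
  asm ("movss %0, %%xmm0" : : "m" (f));
  asm ("shufps $0, %xmm0, %xmm0");
  asm ("movaps %%xmm0, %0" : "=m" (d[0]));
  printf ("%.2f %.2f %.2f %.2f\n", d[0], d[1], d[2], d[3]);
  return 0;
}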
you'll agree that the program should print "2.30 2.30 2.30 2.30".
it does if you use "=m" (d[0]). if you say "=m" (d), it doesn't.
here's what the assembly block compiles to with "=m" (d):
#APP
  movss -148(%ebp), %xmm0
  shufps $0, %xmm0, %xmm0
  movaps %xmm0, -156(%ebp)
#NO_APP
and here's with "=m" (d[0]):
#APP
  movss -148(%ebp), %xmm0
  shufps $0, %xmm0, %xmm0
#NO_APP
  movl -156(%ebp),%eax
#APP
  movaps %xmm0, (%eax)
#NO_APP
so saying "=m" (d) causes xmm0 to be written to &d, not d, as
intended. if &d isn't 128-bit aligned, it will segfault now.
even if it is, that's not where we wanted the numbers from xmm0
to go ...
tim
