Re: [linux-audio-dev] Traps in floating point code

1 Jul 2004

On Thursday 01 July 2004 14:41, Tim Goetze wrote:
...
  [Ruben van Royen]
 please note that SSE2 has support for 64bit floats (doubles) and contains
 an instruction that truncates to int, regardless of control words. A new
 enough gcc with (-march=pentium4 or -msse2) and -mfpmath=sse will use sse
 instead of the old fp unit. This has more advantages, since sse math uses
 normal registers instead of the stack in the old fp unit.
 The disadvantage is of course that it does not run on older processors.
 I'm also not sure what level of sse athlon currently supports. The last
 time I looked, it only supported sse. This is also good, but it lacks
 support for double precision floating point.
 afaik, the athlon XP here only has SSE (not ~2), but the instruction
 set includes this (quote taken from the NASM documentation, section
 B.4):
 CVTTSS2SI reg32,xmm/mem32      ; F3 0F 2C /r    [KATMAI,SSE]
Yes, that one is part of sse. SSE2 adds a variant for 64bit floats
(CVTTSD2SI), so it also works with doubles.
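
A minimal sketch of the two truncating conversions, CVTTSS2SI (SSE, floats) and CVTTSD2SI (SSE2, doubles), written with the compiler intrinsics from <xmmintrin.h> and <emmintrin.h> instead of inline assembly. The helper names are made up for illustration. It is x86 only, and on 32-bit x86 it assumes a build with -msse2. The last helper switches the SSE rounding mode in MXCSR to "round up" to show that the truncating conversion ignores it.

```c
#include <xmmintrin.h>  /* SSE:  _mm_set_ss, _mm_cvttss_si32, _mm_cvtss_si32 */
#include <emmintrin.h>  /* SSE2: _mm_set_sd, _mm_cvttsd_si32 */

/* CVTTSS2SI (SSE): float -> int, always rounded toward zero. */
static inline int trunc_float_sse(float x)
{
    return _mm_cvttss_si32(_mm_set_ss(x));
}

/* CVTTSD2SI (SSE2): double -> int, always rounded toward zero. */
static inline int trunc_double_sse2(double x)
{
    return _mm_cvttsd_si32(_mm_set_sd(x));
}

/* CVTSS2SI (SSE): float -> int using the rounding mode set in MXCSR. */
static inline int round_float_sse(float x)
{
    return _mm_cvtss_si32(_mm_set_ss(x));
}

/* Hypothetical helper: truncate while MXCSR says "round up",
   then restore the previous rounding mode. */
static inline int trunc_float_while_rounding_up(float x)
{
    unsigned int saved = _MM_GET_ROUNDING_MODE();
    _MM_SET_ROUNDING_MODE(_MM_ROUND_UP);
    int r = trunc_float_sse(x);
    _MM_SET_ROUNDING_MODE(saved);
    return r;
}
```

With -msse2 and -mfpmath=sse, a plain (int)x cast should end up as the same truncating instruction; the intrinsics only make that choice explicit.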
...

 CVTTSS2SI converts a single-precision FP value in the source operand
 to a signed doubleword in the destination operand. If the result is
 inexact, it is truncated (rounded toward zero).
 The destination operand is a general purpose register. The source can
 be either an XMM register or a 32-bit memory location. If the source
 is a register, the input value is in the low doubleword.

 the operand requirements are quite different from "fistpl" so
 replacing one with the other requires some additional instructions
 to move the data around. 
If you must first move the data from an FP register to an XMM register, it is
not very likely that you will get a performance improvement. The way to go
would be to do all calculations in SSE code.
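
As a rough illustration of doing all the calculation in SSE code, here is a sketch that scales a buffer of floats and truncates each value to int without the data ever passing through the old fp unit. The function name and the gain parameter are hypothetical, chosen only for the example. The packed path uses CVTTPS2DQ, which needs SSE2; the scalar tail uses CVTTSS2SI, which plain SSE already has.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE:  _mm_loadu_ps, _mm_mul_ps, _mm_cvttss_si32 */
#include <emmintrin.h>  /* SSE2: _mm_cvttps_epi32, _mm_storeu_si128 */

/* Hypothetical helper: out[i] = (int)(in[i] * gain), truncated toward zero.
   Load, multiply and conversion all stay in XMM registers. */
static inline void scale_and_truncate(const float *in, int *out,
                                      size_t n, float gain)
{
    const __m128 g = _mm_set1_ps(gain);
    size_t i = 0;

    /* Four values per iteration: MULPS, then CVTTPS2DQ (SSE2). */
    for (; i + 4 <= n; i += 4) {
        __m128 v = _mm_mul_ps(_mm_loadu_ps(in + i), g);
        _mm_storeu_si128((__m128i *)(out + i), _mm_cvttps_epi32(v));
    }

    /* Remaining values one at a time: MULSS, then CVTTSS2SI (SSE). */
    for (; i < n; ++i) {
        __m128 v = _mm_mul_ss(_mm_load_ss(in + i), g);
        out[i] = _mm_cvttss_si32(v);
    }
}
```

On a machine with SSE but not SSE2, only the scalar loop would be usable as written.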
...

 tim 
