On Thursday 01 July 2004 14:41, Tim Goetze wrote:
[Ruben van Royen]
please note that SSE2 has support for 64-bit floats (doubles) and contains
an instruction that truncates to int, regardless of the control word. A new
enough gcc with (-march=pentium4 or -msse2) and -mfpmath=sse will use SSE
instead of the old x87 FP unit. This has a further advantage, since SSE math
uses directly addressable registers instead of the register stack of the old
FP unit.
The disadvantage is of course that it does not run on older processors.
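As a rough sketch of what this means at the C level (the function name is
made up for illustration): a plain cast from double to int truncates toward
zero by the rules of the language. Built with something like
"gcc -O2 -msse2 -mfpmath=sse", gcc would normally be expected to use the
truncating SSE2 conversion here instead of saving, changing and restoring
the x87 control word around a fistp. That is an assumption about the
generated code, so check the assembly output of your own gcc version.

```c
/* Hypothetical helper, not taken from any existing code base. */
int double_to_int_trunc(double x)
{
    /* C defines this conversion as truncation toward zero,
       whatever rounding mode the FPU happens to be in. */
    return (int)x;
}
```

For example, double_to_int_trunc(2.9) gives 2 and double_to_int_trunc(-2.9)
gives -2.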
I'm also not sure what level of SSE the Athlon currently supports. The last
time I looked, it only supported SSE. That is still useful, but it lacks
support for double-precision floating point.
afaik, the athlon XP here only has SSE (not SSE2), but the instruction
set includes this (quote taken from the NASM documentation, section
B.4):
CVTTSS2SI reg32,xmm/mem32 ; F3 0F 2C /r [KATMAI,SSE]
Yes, that one is part of SSE. SSE2 adds a variant that takes a 64-bit float
as its source, so it also works with doubles.
CVTTSS2SI converts a single-precision FP value in the source operand
to a signed doubleword in the destination operand. If the result is
inexact, it is truncated (rounded toward zero).
The destination operand is a general purpose register. The source can
be either an XMM register or a 32-bit memory location. If the source
is a register, the input value is in the low doubleword.
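For what it's worth, the same instruction can be reached from C through the
SSE intrinsics, without inline assembly. A minimal sketch, assuming a
compiler that ships <xmmintrin.h> and defines __SSE__ when SSE is enabled
(gcc does with -msse); the helper name is hypothetical, and the fallback
branch is only there so the code still builds where SSE is not available:

```c
#if defined(__SSE__)
#include <xmmintrin.h>   /* SSE intrinsics */
#endif

/* Hypothetical helper: truncate a float to int, using CVTTSS2SI
   where the compiler has SSE enabled. */
int float_to_int_trunc(float x)
{
#if defined(__SSE__)
    /* _mm_set_ss places x in the low doubleword of an XMM register;
       _mm_cvttss_si32 corresponds to CVTTSS2SI and always truncates. */
    return _mm_cvttss_si32(_mm_set_ss(x));
#else
    /* Portable fallback: a C cast also truncates toward zero. */
    return (int)x;
#endif
}
```

With this, float_to_int_trunc(2.75f) yields 2 and float_to_int_trunc(-2.75f)
yields -2.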
-
the operand requirements are quite different from "fistpl" so
replacing one with the other requires some additional instructions
to move the data around.
If you must first move the data from an FP register to an XMM register (there
is no direct move between the two, so the value has to go through memory), it
is not very likely that you will get a performance improvement. The way to go
would be to do all calculations in SSE code.
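To illustrate the "all in SSE" route, here is a sketch of a loop that scales
a buffer of floats and truncates the results to int, four values at a time,
without the data ever touching the x87 stack. The function name and its
parameters are invented for this example. Note that the packed conversion
used here, _mm_cvttps_epi32, needs SSE2, so on an SSE-only CPU one would
have to stay with the scalar CVTTSS2SI; without __SSE2__ the code simply
falls back to plain C.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics (pulls in the SSE ones too) */
#endif

/* Hypothetical example: out[i] = (int)(in[i] * gain), truncating. */
void scale_and_truncate(const float *in, int *out, size_t n, float gain)
{
    size_t i = 0;
#if defined(__SSE2__)
    const __m128 g = _mm_set1_ps(gain);          /* gain in all four lanes */
    for (; i + 4 <= n; i += 4) {
        /* unaligned load, multiply, truncating convert, unaligned store */
        __m128 v = _mm_mul_ps(_mm_loadu_ps(in + i), g);
        _mm_storeu_si128((__m128i *)(out + i), _mm_cvttps_epi32(v));
    }
#endif
    /* leftover elements, or everything when SSE2 is not enabled */
    for (; i < n; i++)
        out[i] = (int)(in[i] * gain);
}
```

The unaligned load and store keep the sketch simple; with buffers known to
be 16-byte aligned, the aligned variants would be the usual choice.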
tim