Hi all,
please note that SSE2 supports 64-bit floats (doubles) and contains an
instruction that truncates to int regardless of the control word. A new enough
gcc with (-march=pentium4 or -msse2) and -mfpmath=sse will use SSE instead of
the old x87 FP unit. This has a further advantage: SSE math uses normal
registers instead of the register stack of the old FP unit.
The disadvantage is of course that it does not run on older processors. I'm
also not sure what level of SSE the Athlon currently supports. The last time I
looked, it only supported the original SSE. That is still useful, but it lacks
support for double-precision floating point.
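
A minimal sketch of the point above, in plain C. The helper name split_index
is made up for illustration. A C cast from double to int always truncates
toward zero, and with -msse2 -mfpmath=sse gcc will typically emit the SSE2
truncating conversion (cvttsd2si) for it, with no control word change. Check
the -S output of your own compiler to confirm.

```c
#include <assert.h>

/* Hypothetical helper: split a non-negative read position into an
   integer sample index and a fraction in [0, 1). */
static int split_index(double fullindex, double *fraction)
{
    /* Truncation toward zero equals floor only for non-negative input. */
    assert(fullindex >= 0.0);

    int i = (int) fullindex;      /* the cast always truncates toward zero */
    *fraction = fullindex - i;    /* remainder after removing the index */
    return i;
}
```

For negative positions the cast still truncates toward zero, so the fraction
would come out negative; the assert documents that assumption.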
Ruben
On Wednesday 30 June 2004 23:09, Tim Goetze wrote:
[Jens M Andreasen]
On Tue, 2004-06-29 at 17:15, Steve Harris wrote:
integer = lrintf(fullindex);
fractional = fullindex - integer;
I don't think this is right: fractional will be in [-0.5, 0.5] rather than
[0, 1], which is more normal, because lrintf() rounds to the nearest integer.
I think you should be using lrintf(floor(x)) or (int)x.
Why not just use modf?
double fullindex, increment, integer, fraction;
// int i;
fullindex += increment;
fraction = modf(fullindex, &integer);
// i = integer;
if we want to use the integer part as sample index for a memory
lookup, having it in double/float doesn't buy us much: we still need
to convert to int type, which is costly.
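
a sketch of what that looks like in practice (the helper name advance is
hypothetical): modf() does split the value cleanly, but the integer part
comes back as a double, so the float-to-int conversion is still there when
we want an array index.

```c
#include <math.h>

/* Hypothetical helper: step the read position, return the sample index
   and store the fractional part in *fraction. */
static int advance(double *fullindex, double increment, double *fraction)
{
    double integer;                        /* modf() hands back a double */

    *fullindex += increment;
    *fraction = modf(*fullindex, &integer);

    /* The value is already integral, but an array index needs an int,
       so the conversion that modf() was meant to avoid is still here. */
    return (int) integer;
}
```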
regarding lrintf() on x86: checking the output of gcc-3.0 -S, you can
see that it compiles to a simple "fistpl" instruction. this
instruction relies on the current FPU control word, which defaults to
'round' not 'truncate' (which isn't what we want for memory
indices).
now, "i = lrintf (floor (f))" compiles to "frndint" surrounded by
"fldcw" (load control word, which is __slow__), and finally
"fistpl".
in contrast, "i = (int) f" compiles to a single "fistpl", but of
course (C requires the cast to truncate, and the default FPU control
word is 'round') also surrounded by "fldcw".
so if you want quick fractional sample lookups, the best option on x86
i see is to manually "fldcw" before and after your sample loop, and
use lrintf() or "fistpl" directly to obtain integer indices inside
the loop.
incidentally, you can find a portable implementation in the caps
package.
tim