Ohh, this is very interesting... Thanks Juan for the code snippets!
I did some short tests with extremely varying and surprising results. But... I
have no time right now, so I will proceed with more tests tonight.
As much as I don't like it, I have to support the claim that integer is faster
as of now...
But I'll do my best to prove otherwise tonight. I'll post my results here.
/Robert
On Friday 17 October 2003 01.09, Juan Linietsky wrote:
Benno Sennoner and I were discussing today on IRC the usual fixed point
vs floating point question (regarding some resampling code).
We developed some tests and ran them on a variety of computers.
It would be interesting if LAD'ers here could run them on different
computers (especially non-x86 ones, like amd64 or the Gx processors) so we
can see what performance we can expect on each, and how things
seem to be shaping up for the future. It will also be a key factor
in how our projects develop in the future.
The code is available at:
http://reduz.dyndns.org/resamp_fixp.c // fixed point version
http://reduz.dyndns.org/resamp_float.c // floating point version, portable
http://reduz.dyndns.org/resamp_float_fistl.c // X86 VERSION ONLY!! Uses fistl instruction
The results don't just measure int vs float performance. They
also test float->int conversion, which is common
in most algorithms that work with buffers. The test is a linearly
interpolated resampling with volume control.
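For anyone reading without the files at hand, here is a rough sketch of what a linearly interpolated resampling loop with volume control can look like in float. This is NOT the code from resamp_float.c; the function name, signature and buffer layout are made up for illustration, and it assumes a non-negative increment. The point is only to show where the float->int conversion lands: once per output sample, when the read position is turned into an array index.

```c
#include <stddef.h>

/* Hypothetical sketch, not taken from resamp_float.c.
 * Reads src at a fractional position that advances by `increment`
 * source samples per output sample, interpolates linearly between
 * the two neighbouring samples and scales the result by `volume`.
 * Returns the number of output samples written.
 * Assumes increment >= 0. */
size_t resample_float(const float *src, size_t src_len,
                      float *dst, size_t dst_len,
                      float increment, float volume)
{
    float pos = 0.0f;
    size_t written = 0;

    while (written < dst_len) {
        /* float -> int conversion, done for every output sample */
        size_t idx = (size_t)pos;

        /* stop when there is no right-hand neighbour to interpolate with */
        if (idx + 1 >= src_len)
            break;

        float frac = pos - (float)idx;  /* fractional part of the position */
        float s = src[idx] + (src[idx + 1] - src[idx]) * frac;

        dst[written++] = s * volume;
        pos += increment;
    }
    return written;
}
```

With src = {0, 1, 2, 3}, an increment of 0.5 and a volume of 2.0, this writes six samples (0, 1, 2, 3, 4, 5) and stops once the position reaches the last source sample.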
Please use the GCC options
-O3 -ffast-math -march=<yourcpu> so it's a fair comparison.
Results from other compilers are also appreciated!
Here are some results for reference:
****************************************************************
vendor_id : AuthenticAMD
model name : AMD-K6(tm) 3D processor
cpu MHz : 412.508
cache size : 64 KB
flags : fpu vme de pse tsc msr mce cx8 pge mmx syscall 3dnow
k6_mtrr
bogomips : 822.47
resamp_fixp - 0m8.460s
resamp_float_fistl - 0m27.390s
****************************************************************
vendor_id : AuthenticAMD
model name : AMD Duron(tm) Processor
cpu MHz : 951.701
cache size : 64 KB
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov
pat pse36 mmx fxsr syscall mmxext 3dnowext 3dnow
bogomips : 1900.54
resamp_float - 0m11.180s
resamp_float_fistl - 0m5.810s
resamp_fixp_optimized - 0m2.790s
************************************************
Benno gave me some results:
Intel p4 1800 celeron
resamp_float_fistl - 4.00user
resamp_fixp - 4.51user
(float faster?)
--------------------------------
VIA nehemiah 1GHz
resamp_float_fistl - 0m21.079s
resamp_fixp - 0m7.129s
*************************
Some results on SPARC:
2x 125MHz HyperSPARC
resamp_fixp - user 2m59.010s
resamp_float - user 0m44.290s
(float faster! :)
****************************
Intel 2.4 GHz P4:
resamp_float - 0m3.440s
resamp_float_fistl - 0m2.960s
resamp_fixp - 0m1.450s
************************
1.25GHz G4 (laptop)
resamp_float - 0m11.170s
resamp_fixp - 0m2.130s
-----------------------------------
Conclusions SO FAR.
My own conclusion on the subject is that the float -> int conversion is
STILL the biggest bottleneck on most common architectures. And until this
is sorted out, fixed point is still the best solution for some specific
cases, and I don't see any problem mixing it with floating point code. If
you look closely at that algorithm in the source, you could replace the
"counter/increment"
with purely fixed point values, and do the rest (managing the samples) in
float. This will undoubtedly speed it up.
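To make the mixed approach concrete, here is a rough sketch of it, under the same caveat: it is not taken from any of the resamp_*.c files, and the names, the signature and the choice of 16 fractional bits are assumptions for illustration only. The counter and increment are fixed point, so the sample index comes from a shift instead of a float->int conversion, while the samples themselves stay in float.

```c
#include <stddef.h>
#include <stdint.h>

#define FRAC_BITS 16                    /* assumed: 16 fractional bits */
#define FRAC_ONE  (1u << FRAC_BITS)
#define FRAC_MASK (FRAC_ONE - 1u)

/* Hypothetical sketch of the mixed approach: fixed point counter
 * and increment, float samples. `increment` is the step in source
 * samples per output sample, scaled by FRAC_ONE (0x8000 means 0.5).
 * Returns the number of output samples written. */
size_t resample_hybrid(const float *src, size_t src_len,
                       float *dst, size_t dst_len,
                       uint32_t increment, float volume)
{
    uint64_t pos = 0;                   /* 64 bits so long buffers do not wrap */
    size_t written = 0;

    while (written < dst_len) {
        /* integer part by shifting: no float -> int conversion */
        size_t idx = (size_t)(pos >> FRAC_BITS);

        /* stop when there is no right-hand neighbour to interpolate with */
        if (idx + 1 >= src_len)
            break;

        /* int -> float conversion of the fractional part only */
        float frac = (float)(pos & FRAC_MASK) * (1.0f / FRAC_ONE);
        float s = src[idx] + (src[idx + 1] - src[idx]) * frac;

        dst[written++] = s * volume;
        pos += increment;
    }
    return written;
}
```

The loop still does one int->float conversion per sample for the fractional part. Whether this version is actually faster on a given CPU is exactly what the timings above are meant to find out, so treat it as something to benchmark, not as a result.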
Cheers!
Juan Linietsky