Benno Sennoner and I were discussing the usual fixed point vs floating
point question on IRC today (regarding some resampling code).
We developed some tests and ran them on a variety of computers.
It would be interesting if ladders here could run them on different computers
(especially non-x86 ones, like amd64 or the Gx processors) so we can
see what performance we can expect on each, and how things
seem to be shaping up for the future. It will also be a key factor
in how our projects develop in the future.
The code is available at:
http://reduz.dyndns.org/resamp_fixp.c // fixed point version
http://reduz.dyndns.org/resamp_float.c // floating point version, portable
http://reduz.dyndns.org/resamp_float_fistl.c // X86 VERSION ONLY!! Uses fistl instruction
The results don't just measure int vs float performance. They
also test float->int conversion, which is common
in most algorithms that work with buffers. The test is a linearly
interpolated resampler with volume control.
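
To make clear where the float->int conversion comes in, here is a minimal
sketch of a float resampling loop of this kind. It is NOT the code from the
files above; the function name resample_float and its parameters are made up
for illustration, and it assumes the source buffer is long enough for the
requested output length.

```c
#include <stddef.h>

/* Hypothetical sketch: linearly interpolated resampling with volume
 * control, done entirely in float.
 * Assumption: src holds at least (size_t)((dst_len - 1) * step) + 2 samples. */
void resample_float(const float *src, float *dst, size_t dst_len,
                    float step, float volume)
{
    float pos = 0.0f;                    /* read position in the source */

    for (size_t i = 0; i < dst_len; i++) {
        int idx = (int)pos;              /* float -> int conversion, once per sample */
        float frac = pos - (float)idx;   /* fractional part, 0.0 <= frac < 1.0 */

        /* linear interpolation between two neighbouring samples, then volume */
        dst[i] = (src[idx] + (src[idx + 1] - src[idx]) * frac) * volume;

        pos += step;
    }
}
```

The cast on the first line of the loop body is the conversion that the
float tests pay for on every output sample.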
Please use the GCC options
-O3 -ffast-math -march=<yourcpu> so it's a fair comparison.
Results from other compilers are also appreciated!
Here are some results for reference:
****************************************************************
vendor_id : AuthenticAMD
model name : AMD-K6(tm) 3D processor
cpu MHz : 412.508
cache size : 64 KB
flags : fpu vme de pse tsc msr mce cx8 pge mmx syscall 3dnow k6_mtrr
bogomips : 822.47
resamp_fixp - 0m8.460s
resamp_float_fistl - 0m27.390s
****************************************************************
vendor_id : AuthenticAMD
model name : AMD Duron(tm) Processor
cpu MHz : 951.701
cache size : 64 KB
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat
pse36 mmx fxsr syscall mmxext 3dnowext 3dnow
bogomips : 1900.54
resamp_float - 0m11.180s
resamp_float_fistl - 0m5.810s
resamp_fixp_optimized - 0m2.790s
************************************************
Benno gave me some results:
Intel p4 1800 celeron
resamp_float_fistl - 4.00user
resamp_fixp - 4.51user
(float faster?)
--------------------------------
VIA nehemiah 1GHz
resamp_float_fistl - 0m21.079s
resamp_fixp - 0m7.129s
*************************
Some results on SPARC:
2x 125MHz HyperSPARC
resamp_fixp - user 2m59.010s
resamp_float - user 0m44.290s
(float faster! :)
****************************
Intel 2.4 GHz P4:
resamp_float - 0m3.440s
resamp_float_fistl - 0m2.960s
resamp_fixp - 0m1.450s
************************
1.25GHz G4 (laptop)
resamp_float - 0m11.170s
resamp_fixp - 0m2.130s
-----------------------------------
Conclusions SO FAR.
My own conclusion on the subject is that the float -> int conversion is
STILL the biggest bottleneck on most common architectures. Until this is
sorted out, fixed point is still the best solution for some specific cases,
and I don't see any problem mixing it with floating point code. If you look
closely at the algorithm in the source, you could use purely fixed point
values for "counter/increment" and do the rest (managing the samples) in
float. This will undoubtedly speed it up.
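
A rough sketch of that mixed approach, under some assumptions: the counter
and increment are 16.16 fixed point values, the samples stay in float, and
the names (resample_hybrid, FRAC_BITS, FRAC_MASK) are made up here and do
not come from the test files. The loop only converts int -> float, never
float -> int.

```c
#include <stddef.h>
#include <stdint.h>

#define FRAC_BITS 16                          /* assumed 16.16 fixed point format */
#define FRAC_MASK ((1u << FRAC_BITS) - 1u)

/* Hypothetical sketch: fixed point counter/increment, float samples.
 * With a 32 bit 16.16 counter the source is limited to 65536 samples.
 * Assumption: src holds at least
 * (((dst_len - 1) * increment) >> FRAC_BITS) + 2 samples. */
void resample_hybrid(const float *src, float *dst, size_t dst_len,
                     uint32_t increment, float volume)
{
    const float frac_scale = 1.0f / (float)(1u << FRAC_BITS);
    uint32_t counter = 0;                     /* read position, 16.16 fixed point */

    for (size_t i = 0; i < dst_len; i++) {
        uint32_t idx = counter >> FRAC_BITS;  /* integer part: just a shift */
        float frac = (float)(counter & FRAC_MASK) * frac_scale;  /* int -> float only */

        /* interpolation and volume stay in float */
        dst[i] = (src[idx] + (src[idx + 1] - src[idx]) * frac) * volume;

        counter += increment;                 /* integer add, no conversion */
    }
}
```

For example, an increment of 0x8000 (0.5 in 16.16) reads the source at half
speed. Whether this actually beats the other versions on a given CPU would
have to be measured the same way as the tests above.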
Cheers!
Juan Linietsky