On Thu, 22 Nov 2018 23:29:11 +0100
Robin Gareus <robin(a)gareus.org> wrote:
Hi Will,
I just ran your code and -ffast-math does not make any difference.
With or without --ffast-math I get "int: 5 rem: 0.049994"
However optimizing the code with `-O2 --ffast-math` does make a
difference because SSE is used.
Do you also use -O2, or -O3 along with --fast-math?
On 11/22/2018 06:30 PM, Will Godfrey wrote:
[...]
My suspicion is that the difference is due to
accumulated rounding
errors.
Curiously without the decrements the behavior with and without
-ffast-math seems to be identical well into the millions.
Yep. you do loose
precision. In IEEE 32bit float, there is no 0.1.
0.1 would be approximated as (13421773) * 2^(-27), and 0.05 as
(13421773) * 2^(-28) and truncated.
Note that this is an odd number. The last bit of the mantissa
(0x004ccccd) is lost when the exponent changes.
A simpler example to show this is
#include <stdio.h>
#include <math.h>
int main (int argc, char** argv) {
float a = 0;
for (int i = 0; i < 100; ++i) {
a += 0.1f;
a -= 0.05f;
a = fmodf (a, 1.f);
}
printf ("%f\n", a);
return 0;
}
using gcc 6.3.0-18+deb9u1, x86_64, this
prints 1.000000 (when compiled with -O0)
and 0.000001 (when compiled with -O2 --fast-math)
The difference here is that the former calls fmodf(), while in the
latter, optimized, case the compiler uses cvtss2sd SSE instead. The
internal limits of float precision differ for those cases.
--ffast-math issues in audio-dsp are usually due to re-ordering and
reciprocal approximations. e.g. instead of (x / 2) the compiler calls
(0.5 * x), which can lead to different results if the inverse value does
not a precise floating point representation. e.g. 1.0 / 0.1 != 10.0
A good read on the subject is
http://www.itu.dk/~sestoft/bachelor/IEEE754_article.pdf (Goldberg,
David: "What Every Computer Scientist Should Know About Floating-Point
Arithmetic").
If you were change the two constants to
a += 0.5f;
a -= 0.25f;
the issue vanishes.
Cheers!
robin
Thanks for going into this in such detail Robin. I never realised fp stuff
could be *quite* so, umm, approximate!
I was using O3. Following on from this I found that if I removed either O3 or
ffast-math the problem disappeared.
It's quite possible that I'm over-thinking this as there are actually only a
few iterations before the initial figures are re-calculated, but I don't like
mysteries :(
--
Will J Godfrey
http://www.musically.me.uk
Say you have a poem and I have a tune.
Exchange them and we can both have a poem, a tune, and a song.