Simon Jenkins wrote:
If you're doing this sort of thing you need to fix it with a couple more brackets...
#define FLUSH_TO_ZERO(fv) ((((*(unsigned int*)&(fv))&0x7f800000)==0)?0.0f:(fv))
...and then, as far as I can see, the macro always does exactly what it is supposed to do.
Whoops... I spoke too soon. I've found a way to make the macro fail:
/*====================================================
Program to demonstrate FLUSH_TO_ZERO macro failure.
S.Jenkins 2003
Using GCC 2.95.4
Macro fails at optimisation -O1 or above if -fstrict-aliasing is also used.
====================================================*/
#include <stdio.h>
#include <stdlib.h>   /* for exit() */

#define FLUSH_TO_ZERO(fv) ((((*(unsigned int*)&(fv))&0x7f800000)==0)?0.0f:(fv))

float TestResult;

int main (int argc, char **argv)
{
    int j;

    /* set TestResult to a value that is just barely above the denormal
       float threshold */
    *(unsigned int *)&TestResult = 0x00800001;

    /* now decay TestResult down into the denormals... */
    for( j = 0; j < 1024; j++ ) {
        TestResult = TestResult * 0.999f;
        /* ...but flush it to zero (causing all subsequent values to be
           zero) as soon as it actually becomes denormal */
        TestResult = FLUSH_TO_ZERO(TestResult);
    }

    /* Result should be zero and - if we instrument the code - we should only
       see ourselves paying the penalty of a single denormal calculation */
    printf( "TestResult: %e\n", TestResult );
    exit(0);
}
/*=== the end ==========================================*/
When compiled with -O1 -fstrict-aliasing, this produces a denormal result
and takes a relatively long time doing so.
I guess that the -fstrict-aliasing option is telling the optimiser to ignore
the possibility that a pointer to unsigned int might actually point to a
global float variable, so it believes the "unsigned int" that the macro is
testing cannot be the same entity as the float that the loop is modifying.
So it moves the test outside of the loop.
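To make that guess concrete, here is roughly what the loop would amount to if the test really were hoisted. This is a hand-written sketch of the effect, not actual GCC 2.95.4 output. The function name decay_with_hoisted_test is hypothetical, and the code assumes a 32-bit unsigned int, 32-bit IEEE 754 floats, and hardware that does not itself flush denormals to zero.

```c
#include <string.h>

/* Hypothetical illustration (not compiler output): the exponent test is
   evaluated once, before the loop, and never again. */
static float decay_with_hoisted_test(void)
{
    unsigned int bits = 0x00800001u;   /* just above the denormal threshold */
    float v;
    int is_denormal_or_zero;
    int j;

    memcpy(&v, &bits, sizeof v);

    /* Evaluated while v is still a normal number, so this is false. */
    is_denormal_or_zero = ((bits & 0x7f800000u) == 0);

    for (j = 0; j < 1024; j++) {
        v = v * 0.999f;
        /* The stale test result means v is never flushed. */
        v = is_denormal_or_zero ? 0.0f : v;
    }
    return v;
}
```

Under those assumptions the function returns a small non-zero denormal, which matches the symptom seen with the real program.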
Anyway, enough doom and gloom. Here's the fix:
#define FLUSH_TO_ZERO(fv) ((((*(volatile unsigned int*)&(fv))&0x7f800000)==0)?0.0f:(fv))
(Note: This will stop working once Juhana gets the "volatile" keyword
removed from the C language :) )
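For comparison, a different way to sidestep the aliasing question is to copy the bits out with memcpy instead of casting the pointer. This is only a sketch of an alternative, not the fix proposed above. The names float_from_bits, flush_to_zero and decay are hypothetical, and it assumes a C99 compiler (for <stdint.h>) and 32-bit IEEE 754 floats.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a float from a raw 32-bit pattern. */
static float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical function version of FLUSH_TO_ZERO: the same exponent test,
   but the bits are copied with memcpy rather than read through a cast
   pointer. Assumes float is a 32-bit IEEE 754 single. */
static float flush_to_zero(float fv)
{
    uint32_t bits;
    memcpy(&bits, &fv, sizeof bits);
    /* Exponent field all zero means the value is zero or denormal. */
    return ((bits & 0x7f800000u) == 0) ? 0.0f : fv;
}

/* The decay loop from the test program, returning the final value. */
static float decay(void)
{
    float v = float_from_bits(0x00800001u);
    int j;

    for (j = 0; j < 1024; j++) {
        v = flush_to_zero(v * 0.999f);
    }
    return v;
}
```

Here decay() goes to zero on the first iteration, since 0x00800001 scaled by 0.999 already falls below the smallest normal float. Whether this is as fast as the volatile macro depends on how well the compiler optimises the memcpy calls, which I have not measured.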
Simon Jenkins
(Bristol, UK)