On Tue, 2003-08-05 at 01:48, Simon Jenkins wrote:
that have denormal operands, and it produces denormal results when
calculations underflow. There is no hardware option to flush either denormal
operands or denormal results to zero. (I think that there is such an option
on Itanium processors though, and on some other processor families).
Afaik there is no such option for the x87 FPU of the x86 family. Having the
CPU/FPU flush denormals to zero requires using SSE/3DNow! for floating-point
calculations.
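
For reference, the default behaviour discussed above (an underflowing
calculation yields a denormal rather than zero) can be checked with a few
lines of standard C. This is only a minimal sketch; the function name
underflow_gives_denormal is made up for illustration, and it assumes the
code is built without options such as -ffast-math that may change denormal
handling.

```c
#include <float.h>
#include <math.h>

/* Hypothetical helper: returns 1 if halving the smallest normal double
   produces a denormal (subnormal) value, 0 otherwise. */
int underflow_gives_denormal(void)
{
    /* volatile keeps the division from being folded at compile time */
    volatile double smallest_normal = DBL_MIN;
    volatile double halved = smallest_normal / 2.0;

    return fpclassify(halved) == FP_SUBNORMAL;
}
```

With default IEEE 754 gradual underflow this should return 1.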
To get rid of at least some of the denormal performance problems, one
could use "-march=pentium4 -msse2 -mfpmath=sse" on GCC or "-tpp7 -xW" on
ICC.
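
The flags above only make the compiler emit SSE code; the flush-to-zero
mode itself is a bit in the SSE control register (MXCSR), and as far as I
know it is off unless the compiler, runtime, or program turns it on. A
rough sketch of setting it by hand follows. The function name
sse_flush_to_zero_demo is hypothetical, and the sketch assumes float math
is actually compiled to SSE instructions (the default on x86-64, or with
-mfpmath=sse on 32-bit x86).

```c
#include <float.h>

#if defined(__SSE__)
#include <xmmintrin.h>   /* _MM_SET_FLUSH_ZERO_MODE and friends */
#endif

/* Hypothetical helper: returns 1 if an underflowing float multiply was
   flushed to zero, 0 if it produced a denormal, -1 if SSE is not
   available at compile time. */
int sse_flush_to_zero_demo(void)
{
#if defined(__SSE__)
    unsigned int saved = _MM_GET_FLUSH_ZERO_MODE();
    volatile float tiny = FLT_MIN;   /* smallest normal float */
    volatile float half = 0.5f;
    volatile float product;
    int flushed;

    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    product = tiny * half;           /* would be denormal without FTZ */
    flushed = (product == 0.0f);

    _MM_SET_FLUSH_ZERO_MODE(saved);  /* restore the caller's mode */
    return flushed;
#else
    return -1;
#endif
}
```

If the multiply is done on the x87 FPU instead of SSE, the MXCSR bit has no
effect and the function would be expected to return 0.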
The slow-down that happens with denormal calculations isn't the result
of exception handling; it's the result of the FPU hardware itself taking
a lot longer to perform the calculations.
This performance penalty is _very_ significant on Intel CPUs and rather
low on AMD ones. However, using SSE2 for fp math on P4 fixes this.
--
Jussi Laako <jussi.laako(a)pp.inet.fi>