[Tim Blechmann]
On Fri, 25 Jun 2004 17:38:24 -0500
Jan Depner <eviltwin69(a)cableone.net> wrote:
> On Fri, 2004-06-25 at 13:49, Tim Blechmann wrote:
> > > I have a denormal fix without a branch but you probably don't want
> > > to see it ;-)
> > > It's pretty simple, just OR the bits of the exponent together
> > > which gives either
> > > 0 (denormal) or 1, typecast that to float, and then multiply the
> > > original float by that (0.0 or 1.0). Voila, no branch, but it is
> > > messy looking ;-)
indeed sounds more like a fun proposal; nonetheless i'm wondering how
many cycles 'ORing the exponent bits together' would take. with an
8-bit exponent and no assumption about its value made, 8 binary
'shift', 7 'or' and 1 'and' statement if i'm not badly mistaken. and
if i'm not, a branch will probably hurt less.
(on a related note, is it true that the P4 units have a weakness at
binary shift operations?)
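
For reference, here is a rough C sketch of the exponent-OR trick quoted above. It is not the original poster's code: the function name flush_denormal is made up for illustration, and it assumes a 32-bit IEEE 754 float. Folding the exponent bits pairwise takes three shift/OR steps rather than eight shifts and seven ORs.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE 754 single. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical helper: returns x unchanged if its exponent field is
   non-zero, otherwise a zero (denormals and zeros are flushed).
   No explicit branch in the source. */
static float flush_denormal(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);            /* well-defined type pun */

    uint32_t e = (bits >> 23) & 0xFFu;         /* the 8 exponent bits */

    /* OR-fold the 8 bits down into bit 0. */
    e |= e >> 4;
    e |= e >> 2;
    e |= e >> 1;

    float keep = (float)(e & 1u);              /* 0.0f or 1.0f */
    return x * keep;
}
```

Note that the final multiply still sees the denormal as an operand, so whether this actually dodges the slow path is exactly the open question in this thread. A negative denormal comes back as -0.0f, which compares equal to 0.0f.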
> The definition of denormal is that the exponent is 0 so you will
> never multiply a denormal by 1, only by 0. I'm not sure whether that
> would be a denormal operation or not. It depends on the compiler.
i'd expect the FPU to shortcut a multiplication by zero no matter if
the other operand is denormal or not, though i'm not too sure.
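
To make the definition concrete, a minimal C sketch of a bit-level denormal check follows. The name is_denormal is hypothetical, and a 32-bit IEEE 754 float is assumed. Strictly, a denormal has a zero exponent field and a non-zero mantissa; a zero exponent with a zero mantissa is just +0.0 or -0.0.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE 754 single. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical helper: 1 if x is a denormal, 0 otherwise. */
static int is_denormal(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);            /* well-defined type pun */

    uint32_t exponent = (bits >> 23) & 0xFFu;  /* 8 exponent bits  */
    uint32_t mantissa = bits & 0x007FFFFFu;    /* 23 mantissa bits */

    return exponent == 0 && mantissa != 0;
}
```

The standard library offers the same test as fpclassify(x) == FP_SUBNORMAL from <math.h>. Neither version says anything about how fast the FPU handles a denormal operand; that would have to be measured.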
hm ... someone should write a test for all these algorithms ... i'm
curious, how different compilers / different algorithms actually affect
the speed of the code ...
i think that the best denormal-avoiding approach depends on the
DSP algorithm in question and the extent of your demands for
computational accuracy and S/N ratio. lots of tradeoffs are possible.
tim