On Sat, Apr 05, 2003 at 06:15:09 +0200, Ingo Oeser wrote:
Now make that thread-safe, and esp. thread-safe on an architecture
with weak memory ordering and all the fun stuff.
Sure, it will only work on architectures where 32-bit reads and writes are
atomic.
If you have that all working and look at the assembly, then my
solution is actually smaller, since all your pointer updates are
required to be atomic anyway and that's what we tried to optimize
away here.
Smaller is not the issue; it's branches that are important in inner loops.
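
As a rough illustration of the kind of buffer being argued about here (not
the code from either poster), below is a minimal single-reader,
single-writer ring buffer sketch in C11. It assumes a power-of-two size, so
wrap-around is a mask rather than a branch in the inner loop. The names
ringbuf, rb_push, rb_pop and RB_SIZE are made up for this example. The
explicit acquire/release ordering is what covers the weak-memory-ordering
case, and the 32-bit indices are the only shared state that has to be
read and written atomically.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RB_SIZE 1024u            /* hypothetical size; must be a power of two */
#define RB_MASK (RB_SIZE - 1u)
static_assert((RB_SIZE & RB_MASK) == 0, "RB_SIZE must be a power of two");

typedef struct {
    float buf[RB_SIZE];
    _Atomic uint32_t head;       /* free-running; written only by the writer */
    _Atomic uint32_t tail;       /* free-running; written only by the reader */
} ringbuf;

/* Writer side. Returns 1 on success, 0 if the buffer is full. */
static int rb_push(ringbuf *rb, float v)
{
    uint32_t head = atomic_load_explicit(&rb->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&rb->tail, memory_order_acquire);
    if (head - tail == RB_SIZE)
        return 0;
    rb->buf[head & RB_MASK] = v; /* wrap by masking, no branch */
    /* Release: the sample must be visible before the new head is. */
    atomic_store_explicit(&rb->head, head + 1u, memory_order_release);
    return 1;
}

/* Reader side. Returns 1 on success, 0 if the buffer is empty. */
static int rb_pop(ringbuf *rb, float *out)
{
    uint32_t tail = atomic_load_explicit(&rb->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&rb->head, memory_order_acquire);
    if (head == tail)
        return 0;
    *out = rb->buf[tail & RB_MASK];
    /* Release: the slot must be read before it is handed back. */
    atomic_store_explicit(&rb->tail, tail + 1u, memory_order_release);
    return 1;
}
```

The full/empty checks are still branches, but they can be hoisted out of a
block-processing loop by checking the available count once per block; only
the index wrap has to happen per sample, and that is the mask.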
Yes, your solution is a perfect DSP-chip or hardware solution. It
is even nice for ix86, if you have a compiler that does not optimize
much.
I don't think the compiler will optimise away the thread safety, unless I've
missed something. A vectorising compiler might unroll the loops but it
will still keep the ordering, and the aligned vector operations are still
atomic AFAIK.
- Steve