On Sun, Oct 19, 2008 at 2:55 AM, Paul Davis <paul(a)linuxaudiosystems.com> wrote:
i don't know whether to shoot myself or eat
another couple of the
oft-promised hats.
Don't beat yourself up too badly. Multiple threads accessing shared memory
is *very* tricky. Even smart people (like you) get it wrong sometimes.
so the next question is how best to prevent it. as far
as i can see we
have a couple of proposals:
1) fons' design, which never actually wraps readptr or writeptr, but
masks the address used to access the data buffer
2) removing the intermediate state's visibility
i admit to preferring (2) even though i know that with a 64 bit index,
not wrapping the ptrs is not really a problem.
I also prefer (2).
however, it is not totally clear to me how to prevent
an optimizing
compiler from doing the wrong thing here. unlike the claims made by
someone involved with portaudio, i believe that it is correct to declare
the read_ptr and write_ptr volatile, so that the compiler knows that it
cannot try to be clever about it accesses of the "other" ptr (i.e.
read_ptr for the writing thread, and vice versa). maybe the comment on
the use of volatile was based on some idea that i thought it made the
variables thread safe, which is and never was the case.
i suspect that the safest code looks like:
size_t tmp = (ptr + incr) & mask;
barrier();
ptr = tmp;
but i am not sure whether barrier needs to be read, write or both.
i think that the simpler code:
ptr = (ptr + incr) & mask;
is subject to potential compiler and/or processor "optimization" that
might reduce it back to the problem case of two ops without an
intermediate load/store location. the volatile declaration ought to
prevent the compiler from doing this, and i don't see why a processor
would do this, ever, but clearly i've already been deeply wrong about
this. does anybody know for certain?
It is best to avoid assumptions about what some future compiler may
consider an "optimization". If the register pressure is high at some
point in the program, it may decide to store some value just to free up
register space for other variables.
The "volatile" declaration should remove any need for the compiler
barrier() statement, AFAICT. Note that barrier() is a compiler
directive, and has no effect on the CPU's ability to reorder cache
operations in an SMP memory hierarchy.
Here is a fairly clear and complete description of the memory barrier
issues:
http://lxr.linux.no/linux/Documentation/memory-barriers.txt
As best I can tell, both ring buffer threads require "general" memory
barriers, because they both read and write (different) shared data. In
Linux kernel terms, they'd need to use smp_mb(), since the problems
they address are multiprocessor-only. But, since JACK is not compiled
differently for UP and SMP, the full mb() seems more appropriate.
That stuff makes my head hurt.
--
joq