On Sat, 2008-10-18 at 23:24 -0500, Jack O'Quin wrote:
If the amount read or written exactly fills the
buffer, then a read or write
pointer equal to the last entry plus one will be stored momentarily, before
the correct (masked) value wraps it back to the beginning of the buffer. If
the other thread looks at it in that state, I believe data will be
copied outside
the bounds of the buffer, which is bad.
this is indeed the case. for read space, if the write ptr is "ahead" of
the read ptr, we return write_ptr - read_ptr, which will return a value
that is too large if "write_ptr" gets snapshotted after the increment
and before the mask.
the same is true for writing. if the read ptr is ahead of the write ptr,
we return the difference between them minus one (the extra -1 stops the
write ptr catching up with the read ptr, a condition that indicates that
the buffer is empty). this value will also be too large if the read_ptr
is used in its "intermediate" state.
i am deeply embarrassed but also puzzled. i remember worrying about
precisely this issue on and over across a period of several years, and
yet satisfying myself that there actually wasn't a problem. i think that
my mistake was as follows: if you look at the read space case, but focus
on whether the *read* ptr is too large at that time, then the safety of
the ringbuffer is still guaranteed. ditto, in the write space case, the
write ptr being too large doesn't violate the safety guarantee. but this
is backwards - the thread that computes write space is the writing
thread, which is the only one that modifies write_ptr. ergo, it never
sees write_ptr as volatile. ditto, the reading thread never sees
read_ptr as volatile. the problems arise because of the *other* pointer,
where the temporary intermediate state *does* cause the computations to
be in error.
i don't know whether to shoot myself or eat another couple of the
oft-promised hats.
so the next question is how best to prevent it. as far as i can see we
have a couple of proposals:
1) fons' design, which never actually wraps readptr or writeptr, but
masks the address used to access the data buffer
2) removing the intermediate state's visibility
i admit to preferring (2) even though i know that with a 64 bit index,
not wrapping the ptrs is not really a problem.
however, it is not totally clear to me how to prevent an optimizing
compiler from doing the wrong thing here. unlike the claims made by
someone involved with portaudio, i believe that it is correct to declare
the read_ptr and write_ptr volatile, so that the compiler knows that it
cannot try to be clever about it accesses of the "other" ptr (i.e.
read_ptr for the writing thread, and vice versa). maybe the comment on
the use of volatile was based on some idea that i thought it made the
variables thread safe, which is and never was the case.
i suspect that the safest code looks like:
size_t tmp = (ptr + incr) & mask;
barrier();
ptr = tmp;
but i am not sure whether barrier needs to be read, write or both.
i think that the simpler code:
ptr = (ptr + incr) & mask;
is subject to potential compiler and/or processor "optimization" that
might reduce it back to the problem case of two ops without an
intermediate load/store location. the volatile declaration ought to
prevent the compiler from doing this, and i don't see why a processor
would do this, ever, but clearly i've already been deeply wrong about
this. does anybody know for certain?
--p