[LAD] olivier's jack ringbuffer tests...

Joern Nettingsmeier nettings at folkwang-hochschule.de
Mon Oct 20 15:02:41 UTC 2008

hi everyone!

for the benefit of those JACK devs who don't follow linux-audio-* (if
any), and for posterity in the JACK mailing list archives:

there has been a very insightful discussion on LAU/LAD about ringbuffer
peculiarities and a possible bug. olivier guilyardi has written test
cases that pinpoint those problems.

he has made them available at http://svn.samalyse.com/misc/rbtest.

can i suggest that the rest of this discussion take place on or at least
be cc:ed to jack-devel?


jörn (who has no idea at all about concurrent programming, but knows a
bit about community engineering and data mining :-D)

Olivier Guilyardi wrote:
> Jack O'Quin wrote:
>> On Sun, Oct 19, 2008 at 2:55 AM, Paul Davis <paul at linuxaudiosystems.com> wrote:
>>> i don't know whether to shoot myself or eat another couple of the
>>> oft-promised hats.
>> Don't beat yourself up too badly.  Multiple threads accessing shared memory
>> is *very* tricky.  Even smart people (like you) get it wrong sometimes.
> I agree with Jack. I remember a friend of mine once qualifying multi-threading
> of "kernel quantization". This is complex but also challenging :)
> Although your understanding of these matters is far beyond mine, I think that I
> have a good professional experience of coding methodology. And to me the problem
> here is indeed methodological: there is a lack of unit tests in JACK.
>>> so the next question is how best to prevent it. as far as i can see we
>>> have a couple of proposals:
>>>   1) fons' design, which never actually wraps readptr or writeptr, but
>>>      masks the address used to access the data buffer
>>>   2) removing the intermediate state's visibility
>>> i admit to preferring (2) even though i know that with a 64 bit index,
>>> not wrapping the ptrs is not really a problem.
>> I also prefer (2).
>>> however, it is not totally clear to me how to prevent an optimizing
>>> compiler from doing the wrong thing here. unlike the claims made by
>>> someone involved with portaudio, i believe that it is correct to declare
>>> the read_ptr and write_ptr volatile, so that the compiler knows that it
>>> cannot try to be clever about it accesses of the "other" ptr (i.e.
>>> read_ptr for the writing thread, and vice versa). maybe the comment on
>>> the use of volatile was based on some idea that i thought it made the
>>> variables thread safe, which is and never was the case.
>>> i suspect that the safest code looks like:
>>>        size_t tmp = (ptr + incr) & mask;
>>>        barrier();
>>>        ptr = tmp;
>>> but i am not sure whether barrier needs to be read, write or both.
>>> i think that the simpler code:
>>>       ptr = (ptr + incr) & mask;
>>> is subject to potential compiler and/or processor "optimization" that
>>> might reduce it back to the problem case of two ops without an
>>> intermediate load/store location. the volatile declaration ought to
>>> prevent the compiler from doing this, and i don't see why a processor
>>> would do this, ever, but clearly i've already been deeply wrong about
>>> this. does anybody know for certain?
>> It is best to avoid assumptions about what some future compiler may
>> consider an "optimization".  If the register pressure is high at some
>> point in the program, it may decide to store some value just to free up
>> register space for other variables.
>> The "volatile" declaration should remove any need for the compiler
>> barrier() statement, AFAICT.  Note that barrier() is a compiler
>> directive, and has no effect on the CPU's ability to reorder cache
>> operations in an SMP memory hierarchy.
>> Here is a fairly clear and complete description of the memory barrier
>> issues:  http://lxr.linux.no/linux/Documentation/memory-barriers.txt
>> As best I can tell, both ring buffer threads require "general" memory
>> barriers, because they both read and write (different) shared data.  In
>> Linux kernel terms, they'd need to use smp_mb(), since the problems
>> they address are multiprocessor-only.  But, since JACK is not compiled
>> differently for UP and SMP, the full mb() seems more appropriate.
>> That stuff makes my head hurt.
> Maybe that is because your approach is too theoretical. I've read many theories
> about whether memory barriers are needed or not in lock-free ring buffers, and I
> think it might lead nowhere without experimenting, that is: write test cases.
> Until now, I just couldn't write a test that fail because of the lack of memory
> barrier on x86. However, I think I found another bug in Jack ringbuffer, by
> writing another test.
> It's a bit of a weird test, I call it "bit circle": I start 16 "node" threads.
> Each node :
> - communicate with its adjacent node through 2 ring buffers
> - is responsible for shifting an integer by one bit
> - send the shifted result to the next node through a ring buffer
> - checks that the value read from previous node is correct.
> On my Debian Quad Core and Mac OS X Core Duo boxes, this test fails with Jack's
> ringbuffer even with my patch applied. But it succeeds with Portaudio (both with
> and without memory barriers), and also with Fons' implementation.
> I wish someone could eventually run my test suite on a PowerPC, especially SMP.
> The usage is unchanged:
> svn co http://svn.samalyse.com/misc/rbtest
> cd rbtest
> make test
> The run time is now shorter, about 2 minutes. Below is the output. You can see
> that jack (test-bit-circle-jack) fails the bit circle test, even when patched
> (test-bit-circle-jack-fix1). "|" is printed when a node thread starts, and "-"
> when a node has read 1000 values, to ensure data is really flowing.
> The test-int-array-* tests are the same as my original test. All *-lfq tests use
> Fons's Lfq ringbuffer (modified to use char instead of uint32_t) as backend.
> ./run-tests.sh bit-circle 512 jack jack-fix1 portaudio portaudio-nobarrier lfq
> test-bit-circle-jack - starting (5s max) - buffer size: 512
> || FAILURE: 2 != 514
> test-bit-circle-jack-fix1 - starting (5s max) - buffer size: 512
> || FAILURE: 2 != 514
> test-bit-circle-portaudio - starting (5s max) - buffer size: 512
> ||-|-|-||-|-|-|-|-||--|||--|---- SUCCESS
> test-bit-circle-portaudio-nobarrier - starting (5s max) - buffer size: 512
> |||--|-|||---||-||-|-|--||--|--- SUCCESS
> test-bit-circle-lfq - starting (5s max) - buffer size: 512
> |||--|||-|||-|||--|||-------|--- SUCCESS
> ./run-tests.sh int-array 16 jack jack-fix1 portaudio portaudio-nobarrier lfq
> test-int-array-jack - starting (20s max) - array/buffer size: 8/16
> [reader started] [writer started] [flowing] 52572 != 52568 at offset 0
> FAILURE in chunk 91822
> test-int-array-jack-fix1 - starting (20s max) - array/buffer size: 8/16
> [reader started] [writer started] [flowing] SUCCESS
> test-int-array-portaudio - starting (20s max) - array/buffer size: 8/16
> [reader started] [writer started] [flowing] SUCCESS
> test-int-array-portaudio-nobarrier - starting (20s max) - array/buffer size: 8/16
> [reader started] [writer started] [flowing] SUCCESS
> test-int-array-lfq - starting (20s max) - array/buffer size: 8/16
> [writer started] [reader started] [flowing] SUCCESS

jörn nettingsmeier

home://germany/45128 essen/lortzingstr. 11/

Kurt is up in Heaven now.

More information about the Linux-audio-dev mailing list