[linux-audio-dev] lock-free data structures

Benno Senoner sbenno at gardena.net
Fri Jun 18 03:44:07 UTC 2004

Previous message: [linux-audio-dev] lock-free data structures
Next message: [linux-audio-dev] lock-free data structures
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Paul Davis wrote:

>>in a lock-free way. This ensures zero-copy operation.
>>    
>>
>
>until you want to start processing the data but keep the original
>around. i was always attached to the zero-copy model, but it just
>doesn't seem to pan out in real life.
>  
>
I don't know how ardour works internally, so in your case it might not
work, but in the case of LinuxSampler, for example, we simply do this:
- the disk thread fills the ringbuffer
- the audio thread reads from the ringbuffer, applies some DSP processing
  (envelopes, filters, etc.) and then sends the data to the audio card
In that case we can operate completely lock-free.
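
To make the disk thread / audio thread split above a bit more concrete,
here is a minimal single-producer/single-consumer ring sketched with C11
atomics. It is not LinuxSampler's code: spsc_ring_t, spsc_init(),
spsc_push() and spsc_pop() are made-up names, RING_SIZE is tiny on purpose,
and it moves one sample per call where a real engine would move whole
blocks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define RING_SIZE 8u /* hypothetical size, must be a power of two */

typedef struct {
    float data[RING_SIZE];
    _Atomic size_t write_pos; /* stored only by the producer (disk thread) */
    _Atomic size_t read_pos;  /* stored only by the consumer (audio thread) */
} spsc_ring_t;

static void spsc_init(spsc_ring_t *rb)
{
    atomic_init(&rb->write_pos, (size_t)0);
    atomic_init(&rb->read_pos, (size_t)0);
}

/* Producer side. Returns 1 on success, 0 if the ring is full.
 * One slot always stays empty so that "full" and "empty" differ. */
static int spsc_push(spsc_ring_t *rb, float v)
{
    size_t w = atomic_load_explicit(&rb->write_pos, memory_order_relaxed);
    size_t r = atomic_load_explicit(&rb->read_pos, memory_order_acquire);
    size_t next = (w + 1) & (RING_SIZE - 1);
    assert(w < RING_SIZE);
    if (next == r)
        return 0;
    rb->data[w] = v;
    /* release: the sample becomes visible before the new write_pos */
    atomic_store_explicit(&rb->write_pos, next, memory_order_release);
    return 1;
}

/* Consumer side. Returns 1 and stores a sample in *out, or 0 if empty. */
static int spsc_pop(spsc_ring_t *rb, float *out)
{
    size_t r = atomic_load_explicit(&rb->read_pos, memory_order_relaxed);
    size_t w = atomic_load_explicit(&rb->write_pos, memory_order_acquire);
    if (r == w)
        return 0;
    *out = rb->data[r];
    atomic_store_explicit(&rb->read_pos, (r + 1) & (RING_SIZE - 1),
                          memory_order_release);
    return 1;
}
```

Neither side ever takes a mutex or waits for the other: each index has
exactly one writer, so a full or empty ring is simply reported to the caller.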

>>Plus we added a
>>wrap space so that a section of the beginning of the buffer is
>>replicated after the official upper bound so that the audio thread
>>can read a bit past it and still get the correct audio data (as if
>>it were linear). This speeds up the audio interpolation since for the
>>audio thread it's like reading from a linear segment, no nasty if()
>>checks etc.; it's all done (from time to time, so no practical CPU
>>overhead) when the disk thread writes the data to the ringbuffer.
>
>i still don't understand this. you can easily construct a worst case
>scenario where the reader needs to read just a tiny amount from the
>"end" of the buffer and the rest from the front. in this, it seems to
>me that your "wrap" space actually has to be the same size as the
>buffer itself - i.e. you're actually writing two copies of the data. 
>
>when you use varispeed in ardour, it's very easy to make this situation
>occur. i handled it a different way, one that is less demanding on
>memory usage than a double-sized ringbuffer because reads are limited
>to the process() blocksize.

There is no worst case, and the overhead of our mechanism is very small.
It works like this:

|------ official ringbuffersize (2^n) ------ | ---- wrap space ---|

diskthread:
while(1) {

- figure out the available buffer space up to the ringbuffer end (which
  includes the wrap space)

- read() from disk directly into the ringbuffer memory (starting at
  ringbuffer.write_ptr) until that amount is filled

- after the read, call a ringbuffer macro which takes care of mirroring
  the wrap-space part to the buffer beginning via memcpy() (so no
  duplicate read()s from disk are needed)

}
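
The disk-thread steps above could be sketched roughly as follows. This is
an illustration, not LinuxSampler's ringbuffer macro: wrap_ring_t and
wrap_ring_write() are made-up names, the data arrives through a memcpy()
from src instead of a read() straight into the buffer (to keep the sketch
testable), and the cross-thread publishing of write_pos is left out. The
sketch also mirrors in both directions, which goes beyond the single
memcpy() described above; that is an assumption added so that chunks of any
size keep the wrap space identical to the buffer beginning.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef float sample_t;

/* Hypothetical layout: `size` official samples (a power of two)
 * followed by `wrap` extra samples of wrap space. */
typedef struct {
    sample_t *buf;    /* points to size + wrap samples */
    size_t size;      /* official ringbuffer size, 2^n */
    size_t wrap;      /* wrap space, smaller than size */
    size_t write_pos; /* always < size */
} wrap_ring_t;

/* Write n samples at write_pos, possibly running into the wrap space,
 * and keep buf[size + i] == buf[i] for every i < wrap that was written.
 * The caller must make sure the reader no longer needs the target area.
 * Sketch restriction: n <= size - wrap, so the two mirror cases below
 * can never both apply to one call. */
static void wrap_ring_write(wrap_ring_t *rb, const sample_t *src, size_t n)
{
    size_t start = rb->write_pos;
    size_t end = start + n;

    assert(rb->wrap < rb->size && start < rb->size);
    assert(n <= rb->size - rb->wrap);
    assert(end <= rb->size + rb->wrap);

    memcpy(rb->buf + start, src, n * sizeof *src);

    if (end > rb->size) {
        /* Part of the chunk landed in the wrap space: mirror that
         * part to the buffer beginning. */
        memcpy(rb->buf, rb->buf + rb->size,
               (end - rb->size) * sizeof *src);
    } else if (start < rb->wrap) {
        /* Part of the chunk landed in the first `wrap` samples:
         * mirror it into the wrap space (the added assumption). */
        size_t stop = end < rb->wrap ? end : rb->wrap;
        memcpy(rb->buf + rb->size + start, rb->buf + start,
               (stop - start) * sizeof *src);
    }

    rb->write_pos = end & (rb->size - 1);
}
```

With size = 8 and wrap = 3, writing the samples 1..12 in chunks of 5, 5
and 2 leaves the run 7, 8, 9, 10, 11 contiguous at indices 6..10, so a
reader starting at index 6 never has to handle the wrap itself.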

audiothread:
while(1) {

sample_t *sample = ringbuffer.read_ptr;

- process samples beginning at "sample"
- write data to the audio card
- increase the read_ptr in the ringbuffer

}

Basically, thanks to the wrap space, the code in the audio thread can
treat the sample[] array like a linear array of samples; it does not know
anything about wrapping. The only thing needed beyond linear reads is the
increase of the read_ptr (taken as an index here), which is the usual
read_ptr = (read_ptr + samples_read) & (ringbuffer_size - 1)
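
A small sketch of what this buys the audio thread, under assumptions:
render_linear() and advance_read_pos() are made-up names, linear
interpolation stands in for whatever the engine really uses, and the
fractional read position is dropped at the end of a fragment, where a real
engine would carry it over to the next one.

```c
#include <assert.h>
#include <stddef.h>

/* Render n_out output samples at playback ratio `pitch` (1.0 = original
 * speed, 2.0 = one octave up), reading from src as if it were a plain
 * linear array. The highest index touched is (size_t)((n_out - 1) * pitch)
 * + 1, which the wrap space has to cover. Returns the number of whole
 * source samples consumed. */
static size_t render_linear(const float *src, float *out,
                            size_t n_out, double pitch)
{
    double pos = 0.0;
    assert(pitch > 0.0);
    for (size_t i = 0; i < n_out; i++) {
        size_t k = (size_t)pos;       /* integer part of the position */
        double frac = pos - (double)k; /* fractional part */
        /* no wrap check on src[k + 1]: the wrap space makes it valid */
        out[i] = (float)((1.0 - frac) * src[k] + frac * src[k + 1]);
        pos += pitch;
    }
    return (size_t)pos;
}

/* Advance the read index; ring_size is the official size (power of two),
 * so the mask is equivalent to a modulo. */
static size_t advance_read_pos(size_t read_pos, size_t samples_read,
                               size_t ring_size)
{
    assert(ring_size != 0 && (ring_size & (ring_size - 1)) == 0);
    return (read_pos + samples_read) & (ring_size - 1);
}
```

The only place where the ring shape shows up is advance_read_pos(); the
inner loop has no branch for the buffer end.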

How do we dimension the wrap space so that the audio thread never
accidentally oversteps the buffer boundary?
We do it the following way:
we limit the speedup a sample can have, e.g. a sample can be played at
most one octave higher, which means twice the speed.

Thus the wrap space is given by:
wrap_space = audio_card_fragment_size * 2.0 + 3;
The audio card fragment size (or the number of JACK frames) is constant
during processing, so the wrap space can be preallocated. Of course, if
you want to increase the fragment size, the buffers have to be
reallocated, but this rarely happens and is not difficult to implement.
The 2.0 is the maximum speedup from above, and the +3 comes from the fact
that cubic interpolation fetches the current sample plus 3 subsequent
samples (linear interpolation fetches the current sample plus one
subsequent sample).
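
The sizing rule above, written as a small helper. wrap_space_samples() and
its parameter names are made up for this sketch; rounding a fractional
product up is an added assumption, made so that non-integer speedups never
get a wrap space that is too small.

```c
#include <assert.h>
#include <stddef.h>

/* fragment_size: samples rendered per audio callback
 * max_speedup:   highest allowed playback ratio (2.0 = one octave up)
 * lookahead:     extra samples the interpolator fetches after the
 *                current one (3 for cubic, 1 for linear) */
static size_t wrap_space_samples(size_t fragment_size, double max_speedup,
                                 size_t lookahead)
{
    double span = (double)fragment_size * max_speedup;
    size_t whole = (size_t)span;

    assert(max_speedup >= 1.0);
    if ((double)whole < span)
        whole++; /* round a fractional span up */
    return whole + lookahead;
}
```

For a fragment of 512 samples this gives 1027 samples at a maximum of one
octave up with cubic interpolation, and 4099 samples at three octaves up.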

About the memory overhead of the wrap space: it's low, since those disk
streaming buffers are usually quite big, e.g. 128k samples, while the
audio card's fragment size is small compared to that number, e.g. 512
samples.
So even if you wanted to play a sample 3 octaves higher (8 times faster),
you get
512 * 8 + 3 = 4099 samples (about 4K), which is still only about 3% of
the buffer.
Keep in mind that playing a sample 8 times faster puts quite a big load
on the disk streaming system, so I guess you would need to increase the
above buffer size, which would lower the relative memory overhead of the
wrap space to below 1% or so.
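
To check the percentages above, a tiny helper (wrap_overhead_percent() is
a made-up name, and 128k is taken as 131072 samples):

```c
#include <assert.h>
#include <stddef.h>

/* Wrap space as a percentage of the official ringbuffer size. */
static double wrap_overhead_percent(size_t ring_size, size_t wrap_space)
{
    assert(ring_size > 0);
    return 100.0 * (double)wrap_space / (double)ring_size;
}
```

With a 131072-sample buffer, a wrap space of 4099 samples (8x speedup)
comes out at roughly 3.1%, and the 1027 samples of the one-octave case at
roughly 0.8%.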

I challenge anyone to come up with a simpler and more efficient approach
for streaming samples with varispeed from disk :)
Perhaps in ardour's case our approach is not ideal, but in the case of
LinuxSampler it's the most efficient approach we can think of.

cheers,
Benno
http://www.linuxsampler.org
