On 5/27/05, David Cournapeau <cournape(a)gmail.com> wrote:
> There are two things that make it a bit difficult for me to read:
>  - the first is all the thread synchronisation stuff (I intend to
>    look at that problem later),
>  - the second is the two-threads thing: the DSP thread and the JACK
>    thread (the "more" RT one).
> My main problem is the two-threads thing. If I understand correctly,
> it works like this:
> 1) io_init starts it all (from the point of view of the io module): it
>    parses some command-line options, registers the callbacks with JACK
>    in 'normal mode' (i.e. no dummy mode), and calls process_init. My
>    first question: is dsp_block_size the *fixed* block size of your
>    actual DSP algorithm? It looks like it, but I am not sure, since it
>    is defined in another module.
Yes. The FFT wants to use 256-sample windows. Most of the
multi-thread complexity was driven by a desire to do it that way
even when JACK is running with smaller buffers. Otherwise, we
found that the computational overhead of the FFT went up at the
worst possible moment, when running at low latency.
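To put rough numbers on it: at 48 kHz a 256-sample window is about
5.3 ms of audio, so with JACK running 64-frame periods (about 1.3 ms
each) a full window only becomes available every fourth period. Those
figures are just an illustration, not from any particular setup.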
> 2) io_activate is the second io-module function called by main; it
>    registers the JACK ports and creates a DSP thread if necessary.
Yes. It also calls jack_activate(), which creates the JACK process
thread.
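Roughly, the activation step has this shape. This is only a sketch, not
the actual code: io_dsp_thread, use_dsp_thread, client, in_port and
out_port are placeholder names; jack_port_register() and jack_activate()
are the real JACK calls.

    #include <jack/jack.h>
    #include <pthread.h>

    /* hypothetical globals standing in for the io module's state */
    extern jack_client_t *client;
    extern jack_port_t *in_port, *out_port;
    extern void *io_dsp_thread(void *arg);   /* lower-priority DSP loop */

    int io_activate(int use_dsp_thread)
    {
        /* register one input and one output audio port */
        in_port  = jack_port_register(client, "in",  JACK_DEFAULT_AUDIO_TYPE,
                                      JackPortIsInput, 0);
        out_port = jack_port_register(client, "out", JACK_DEFAULT_AUDIO_TYPE,
                                      JackPortIsOutput, 0);
        if (in_port == NULL || out_port == NULL)
            return -1;

        /* create the DSP thread before JACK can start calling io_process() */
        if (use_dsp_thread) {
            pthread_t dsp;
            if (pthread_create(&dsp, NULL, io_dsp_thread, NULL) != 0)
                return -1;
        }

        /* after this call JACK may invoke io_process() at any time */
        return jack_activate(client);
    }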
> 3) If no DSP thread is created, then once io_activate returns, JACK
>    may begin to call io_process.
JACK can do that at any time after the call to jack_activate().
The DSP thread is normally always created; the exceptions are an
explicit user request to disable it, or thread creation failing for
some strange reason.
> io_process is the function where the buffering I am interested in
> happens, right? My problem is that I don't understand the DSP thread
> thing, and I am a bit confused. Can I just consider the case where no
> DSP thread is created in order to understand the buffering issue? What
> is the DSP thread for, exactly?
io_process() is the JACK process handler, called in the JACK realtime
thread. If the current JACK buffer size is large enough to accommodate
the dsp_block_size, then no queuing is required and io_process() calls
process_signal() directly in the JACK thread. This is the usual, simple
case.
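In sketch form it is something like this. Treat the signatures of
process_signal() and io_queue() as my assumptions here, not the real
ones; jack_port_get_buffer() is the real JACK call.

    #include <jack/jack.h>

    extern jack_port_t *in_port, *out_port;
    extern jack_nframes_t dsp_block_size;          /* e.g. 256 */
    extern void process_signal(jack_default_audio_sample_t *in,
                               jack_default_audio_sample_t *out,
                               jack_nframes_t nframes);
    extern void io_queue(jack_default_audio_sample_t *in,
                         jack_default_audio_sample_t *out,
                         jack_nframes_t nframes);

    /* JACK process callback, called in the JACK realtime thread */
    int io_process(jack_nframes_t nframes, void *arg)
    {
        jack_default_audio_sample_t *in  = jack_port_get_buffer(in_port, nframes);
        jack_default_audio_sample_t *out = jack_port_get_buffer(out_port, nframes);

        if (nframes >= dsp_block_size) {
            /* buffer is big enough: run the DSP directly in the JACK thread */
            process_signal(in, out, nframes);
        } else {
            /* otherwise hand the data off to the DSP thread (see below) */
            io_queue(in, out, nframes);
        }
        return 0;
    }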
Note that the JACK buffer size can change. If it does, JACK calls
io_bufsize(), which may adjust the buffer latency, if necessary.
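The buffer size handler is just an ordinary JACK callback, registered
before activation. Something along these lines, where
jack_set_buffer_size_callback() is the real JACK call and the latency
bookkeeping (extra_latency, io_register_callbacks) is only illustrative:

    #include <jack/jack.h>

    extern jack_nframes_t dsp_block_size;
    extern jack_nframes_t extra_latency;   /* hypothetical bookkeeping */

    /* called by JACK whenever the engine buffer size changes */
    int io_bufsize(jack_nframes_t nframes, void *arg)
    {
        /* when JACK buffers are smaller than the DSP block, the queuing
           path adds roughly one dsp_block_size of latency; the exact
           figure depends on how the ringbuffers are primed */
        extra_latency = (nframes < dsp_block_size) ? dsp_block_size : 0;
        return 0;
    }

    void io_register_callbacks(jack_client_t *client)
    {
        /* registered before jack_activate() */
        jack_set_buffer_size_callback(client, io_bufsize, NULL);
    }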
When the buffer size is less than dsp_block_size, io_process does not
handle each buffer synchronously. Instead, io_queue() queues the
buffer to a lower-priority realtime thread and schedules that thread
to run if there are enough samples for it (io_schedule). When
the io_dsp_thread() wakes up, it calls process_signal() with the larger
block size, queuing the results back via its output ringbuffers.
Closing the loop, io_queue() running in the higher-priority JACK
thread copies those output ringbuffers into the corresponding JACK
output ports. This adds some buffering latency, which is the main cost
of running the extra thread. In each io_process() call, we queue input
buffers for later handling and return output buffers that were
processed earlier.
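In case it helps, here is a rough sketch of how that round trip can be
coded with JACK's lock-free ringbuffers (jack/ringbuffer.h). The names
in_rb, out_rb, dsp_lock and dsp_run are placeholders, the io_schedule
step is folded into io_queue for brevity, and DSP_BLOCK_SIZE stands in
for dsp_block_size; the real code differs in detail.

    #include <jack/jack.h>
    #include <jack/ringbuffer.h>
    #include <pthread.h>
    #include <string.h>

    #define DSP_BLOCK_SIZE 256              /* assumed value of dsp_block_size */

    extern jack_ringbuffer_t *in_rb, *out_rb;   /* created during io_init() */
    extern pthread_mutex_t dsp_lock;
    extern pthread_cond_t  dsp_run;
    extern void process_signal(jack_default_audio_sample_t *in,
                               jack_default_audio_sample_t *out,
                               jack_nframes_t nframes);

    /* called from io_process() when nframes < DSP_BLOCK_SIZE */
    void io_queue(jack_default_audio_sample_t *in,
                  jack_default_audio_sample_t *out,
                  jack_nframes_t nframes)
    {
        size_t bytes = nframes * sizeof(jack_default_audio_sample_t);

        /* queue this period's input for the DSP thread */
        jack_ringbuffer_write(in_rb, (const char *) in, bytes);

        /* return output processed in an earlier cycle (silence until the
           DSP thread has produced its first block) */
        if (jack_ringbuffer_read_space(out_rb) >= bytes)
            jack_ringbuffer_read(out_rb, (char *) out, bytes);
        else
            memset(out, 0, bytes);

        /* wake the DSP thread once a whole block has accumulated; trylock
           keeps the JACK thread from ever blocking on the mutex */
        if (jack_ringbuffer_read_space(in_rb) >=
                DSP_BLOCK_SIZE * sizeof(jack_default_audio_sample_t)
            && pthread_mutex_trylock(&dsp_lock) == 0) {
            pthread_cond_signal(&dsp_run);
            pthread_mutex_unlock(&dsp_lock);
        }
    }

    /* lower-priority realtime thread: runs the DSP one full block at a time */
    void *io_dsp_thread(void *arg)
    {
        jack_default_audio_sample_t inbuf[DSP_BLOCK_SIZE];
        jack_default_audio_sample_t outbuf[DSP_BLOCK_SIZE];
        size_t block_bytes = sizeof(inbuf);

        pthread_mutex_lock(&dsp_lock);
        for (;;) {
            while (jack_ringbuffer_read_space(in_rb) >= block_bytes) {
                jack_ringbuffer_read(in_rb, (char *) inbuf, block_bytes);
                process_signal(inbuf, outbuf, DSP_BLOCK_SIZE);
                jack_ringbuffer_write(out_rb, (const char *) outbuf, block_bytes);
            }
            pthread_cond_wait(&dsp_run, &dsp_lock);
        }
        return NULL;
    }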
> Sorry if I sound stupid, but my experience in C is mainly in
> number-crunching modules for matlab/octave extensions, and all this
> real-time stuff is really new to me :)
Multithread realtime programming is never simple. Less experienced
programmers often underestimate it. You were smart to ask.
--
joq