On Mon, Jan 22, 2007 at 10:58:10AM -0500, Paul Davis wrote:
On Sun, 2007-01-21 at 22:21 +0100, Maarten de Boer wrote:
Hi Paul
I have a question about something you say in your slashdot post:
"The overhead of calling the graph associated with the data flow for
the frames is not insignificant, even on contemporary processors.
Therefore, calling the graph the minimum number of times is of some
significance, significance that only grows as the latency is reduced.
Because of this, all existing designs, including ASIO and CoreAudio
(with the proviso that CoreAudio is *not* driven by the interrupt from
the audio interface) call the graph only once for every hardware buffer
segment/period/whatever."
Do you have some numbers to show how relevant this overhead actually is?
I mean, if I use a specific internal buffer size (say 128 samples),
independent of the system buffer size, would that really be noticeable?
I can think of some situations where this would be preferable. For
example, if you have many points in your call graph where a fixed buffer
size is required (say some FFTs). Rather than doing buffering at all
these points, it seems to make more sense to run the whole call graph
at that buffer size. I hope I make myself clear...
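To make it concrete, here is a minimal sketch of the kind of client I have
in mind (run_graph and INTERNAL_BLOCK are just names I made up for
illustration, and it assumes the JACK period size is a multiple of the
internal block; otherwise a ring buffer would have to carry the remainder
across periods):

/* build: gcc -o subblock subblock.c $(pkg-config --cflags --libs jack) */
#include <string.h>
#include <unistd.h>
#include <jack/jack.h>

#define INTERNAL_BLOCK 128   /* fixed internal buffer size, e.g. an FFT hop */

static jack_port_t *in_port, *out_port;

/* stand-in for the whole internal call graph; always sees INTERNAL_BLOCK frames */
static void run_graph(const jack_default_audio_sample_t *in,
                      jack_default_audio_sample_t *out, jack_nframes_t n)
{
    memcpy(out, in, n * sizeof(*out));   /* real DSP would go here */
}

static int process(jack_nframes_t nframes, void *arg)
{
    const jack_default_audio_sample_t *in = jack_port_get_buffer(in_port, nframes);
    jack_default_audio_sample_t *out = jack_port_get_buffer(out_port, nframes);
    jack_nframes_t off;
    (void) arg;

    /* run the graph in fixed-size sub-blocks, independent of the JACK period */
    for (off = 0; off < nframes; off += INTERNAL_BLOCK)
        run_graph(in + off, out + off, INTERNAL_BLOCK);
    return 0;
}

int main(void)
{
    jack_client_t *client = jack_client_open("subblock-test", JackNullOption, NULL);
    in_port  = jack_port_register(client, "in",  JACK_DEFAULT_AUDIO_TYPE, JackPortIsInput, 0);
    out_port = jack_port_register(client, "out", JACK_DEFAULT_AUDIO_TYPE, JackPortIsOutput, 0);
    jack_set_process_callback(client, process, NULL);
    jack_activate(client);
    for (;;)
        sleep(1);
    return 0;
}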
I did some experiments, and did not notice any significant difference
using different internal buffer sizes for my call graph. I am talking
about a call graph within a single application, and maybe you were
talking about a call graph with context switches?
both, really.
if you get down to very, very small buffer sizes (say, 4), the overhead
of making the function calls gets to be a significant fraction of the
total time spent processing the audio. this is partly why running JACK
with low buffer sizes shows the "DSP load" as significantly higher, even
if almost nothing is happening. you will only see this effect with deep
call trees. inside a single, simple app, the effect is probably hard to
detect.
but yes, because of context switches between clients, it becomes even
more true in JACK. that effect is measurable and isn't tiny.
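to put rough numbers on it, here's a trivial back-of-the-envelope
calculation (mine, assuming a 48 kHz sample rate; not a measurement):

#include <stdio.h>

int main(void)
{
    const double rate = 48000.0;                 /* assumed sample rate */
    const int periods[] = { 4, 64, 256, 1024 };
    int i;

    for (i = 0; i < 4; i++) {
        /* how often the whole graph (and every client context switch) has to
           run, and how much wall-clock time each period allows */
        printf("-p %4d: %7.1f wakeups/s, %6.3f ms per period\n",
               periods[i], rate / periods[i], 1000.0 * periods[i] / rate);
    }
    return 0;
}

at -p 4 every fixed per-wakeup cost gets paid 12000 times a second, which is
why the load shows up even when the graph is doing almost nothing.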
By coincidence, I was just experimenting with this a few hours ago, before reading this.
I'm puzzled-- and a bit worried-- by the difference between the high CPU load with
small -p values and the low CPU load with high -p values.
It turns out it makes a significant difference on my system. If I load up a couple of
fluidsynth instances, ardour, hydrogen, and rosegarden, and then jack-rack, running a
single processing chain of CAPS Tube Amp IV, CAPS Cabinet II, TAP Autopanner, and CAPS
Plate 2x2 Reverb (this is my Rhodes piano sound), then my CPU usage goes up to 50% with
-p 64 -n 2, with absolutely nothing actually being played. With -p 1024 -n 3 the CPU
usage was about 8%. When my system was working, I was using -p 256 -n 3, which was
passable for most purposes although a little too laggy.
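Roughly, assuming the interface was running at 48 kHz, those settings correspond to:

   -p 64   -n 2:    64 * 2 / 48000  =  ~2.7 ms of buffering
   -p 256  -n 3:   256 * 3 / 48000  =  16.0 ms
   -p 1024 -n 3:  1024 * 3 / 48000  =  64.0 ms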
This is a Mac Mini Intel Core Duo 1.66 GHz, 2 GB RAM, and an FA-66 audio interface with
FreeBOB, running 2.6.19.1-rt15 and jack 0.101.1-2.
It looks like it might be another system tuning complication now: if I lower my latency, I
will get Xruns and bad crackles due to too much CPU usage.
-ken