On Wed, 29 Jun 2005 13:20:31 +0200
Benno Senoner <sbenno(a)gardena.net> wrote:
> assume we run convolution at 512 samples.
>
> process(float *input, float *output, int numframes) {
>   if (numframes == 512) {
>     convolve(input, output, 512);
>     return;
>   }
This has a subtle bug afaict. Let's assume the host first calls process()
several times with numframes != 512 and then once with numframes == 512,
i.e.:
1. 123
2. 432
3. 234
4. 512
The 4th process() call disregards the data which the previous calls with
numframes != 512 have already put into the ringbuffer. There needs to be
an additional test for whether the ringbuffer is empty. If the host
always uses a constant buffer size of 512 the code works out alright as
it is, and its behaviour is identical to the version with the additional
test.
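A minimal sketch of that additional test, under a few assumptions of mine:
convolve() here is only a stand-in that copies its input (and counts its
calls), the two flat arrays replace the real ringbuffer, and the names
in_fill and out_pending are made up for illustration. In this sketch
"empty" has to cover both directions, the collected input and the convolved
output that hasn't been handed back yet, otherwise the shortcut would skip
a block of pending output.

```c
#include <assert.h>
#include <string.h>

#define BLOCK 512  /* internal convolution size, as in the quoted example */

/* Stand-in for the real convolution (assumption): copies the block
   unchanged and counts how often it is called. */
static int convolve_calls = 0;
static void convolve(const float *in, float *out, int n) {
    memcpy(out, in, (size_t)n * sizeof(float));
    convolve_calls++;
}

static float in_buf[BLOCK];   /* input frames collected so far */
static float out_buf[BLOCK];  /* most recent convolved block */
static int in_fill = 0;       /* frames waiting in in_buf */
static int out_pending = 0;   /* frames in out_buf not yet returned */

void process(const float *input, float *output, int numframes) {
    assert(numframes >= 0);

    /* Shortcut only when nothing at all is buffered. */
    if (numframes == BLOCK && in_fill == 0 && out_pending == 0) {
        convolve(input, output, BLOCK);
        return;
    }

    for (int i = 0; i < numframes; i++) {
        /* Hand back previously convolved data, or silence if none yet. */
        if (out_pending > 0) {
            output[i] = out_buf[BLOCK - out_pending];
            out_pending--;
        } else {
            output[i] = 0.0f;
        }
        /* Collect input; convolve once a full block is there. */
        in_buf[in_fill++] = input[i];
        if (in_fill == BLOCK) {
            convolve(in_buf, out_buf, BLOCK);
            in_fill = 0;
            out_pending = BLOCK;
        }
    }
}
```

With the call sequence 123, 432, 234, 512 from above, the 4th call now goes
through the buffered path, and the output is simply the input delayed by
512 frames. Once the buffered path has been used, the shortcut is never
taken again in this sketch, so the latency stays at 512 frames.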
I can imagine though that subdividing periods into smaller parts is very
useful, especially when automating plugins (with control data changes
which do not fall on period boundaries), so the fact that a constant
buffer size is not guaranteed does make a lot of sense.
But in the case of a plugin which operates with fixed buffer sizes (and
uses internal buffering as described) this approach wouldn't make sense,
as the control data change wouldn't have any effect anyway. Or at least
it would be nontrivial to hack it into the algo, even if it is possible:
for partitioned convolution, e.g., gain changes could take effect at
positions other than buffer size boundaries.
So i'm very much in favour of Chris' proposal to add a hint that makes
the host use the same buffer size all the time.
I would actually be very much in favour of making this the default
behaviour and timestamping control change events as it's done in VST [see
below].
[snip]
> the first time process() is called the >=512 condition is not satisfied
> and thus a 0 filled buffer is returned (silence).
> At the second process() call, the >=512 condition is satisfied (there
> are exactly 512 frames in the buffer).
> And the convolve() function is called, eating 9.2msec of CPU.
> Since 9.2msec > 5.5msec ... sh*t happens ... XRUN.
exactly.
> If numframes supplied by the host is bigger than 512 then there are no
> CPU spike problems.
> For example if the host supplies 1024 frames, the above code would call
> convolve() 2 times outputting 1024 frames. (eating 2x9.2msec out of the
> 22msec available)
> It would be a bit inefficient because if the plugin knows that the host
> supplies at least 1024 frames then you could run the convolution at
> 1024 achieving greater efficiency.
> If the host guarantees that it always supplies the same number of frames
> then the convolver could adjust its internal framesize to achieve
> optimal CPU usage.
Right. That's why the hint suggested by Chris would be useful..
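The load figures quoted above can be checked with a few lines of C. This is
only a sketch under my own assumptions: the host always passes the same
number of frames, the plugin runs one convolve() per 512 collected frames,
and the function names are made up. The 9.2 msec, 5.5 msec and 22 msec
values are the ones from the quoted text; the 256 frame period follows from
"exactly 512 frames in the buffer" at the second call.

```c
#include <assert.h>

#define BLOCK 512  /* internal convolution size */

/* Number of convolve() calls triggered by the n-th process() call
   (n counts from 0) when the host always passes `period` frames. */
int convolves_in_call(int period, int n) {
    assert(period > 0 && n >= 0);
    long before = ((long)n * period) / BLOCK;        /* blocks done before */
    long after  = ((long)(n + 1) * period) / BLOCK;  /* blocks done after  */
    return (int)(after - before);
}

/* Nonzero if the convolution work in the n-th call alone exceeds the
   time available for one period. */
int misses_deadline(int period, int n, double convolve_ms, double budget_ms) {
    return convolves_in_call(period, n) * convolve_ms > budget_ms;
}
```

For a period of 256 frames the calls alternate between 0 and 1 convolve(),
so misses_deadline(256, 1, 9.2, 5.5) is true although the average load
would fit. For 1024 frames every call does 2 convolve()s, 18.4 msec out of
22 msec, and misses_deadline(1024, n, 9.2, 22.0) is false for every n.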
> If not then a scheme like the above one is unavoidable.
> Just for curiosity, does anyone know what's the current status of the
> variable/fixed buffer sizes scenarios supplied to plugins by hosts on
> various plugin platforms like VST, AU etc ?
Afaik in VST [i heard it somewhere, no guarantees about correctness] the
plugin knows about the host's buffer size, and the plugin will always be
called with that buffer size [dunno if power of two is guaranteed, but it
would be sensible]. Control params are timestamped and provided as a
list of value/frame pairs for the current process buffer, so there is no
need to subdivide the buffer for finer grained automation etc. This comes
at the cost of some extra work on the plugin side.
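To illustrate the value/frame pair idea, here is a small sketch for a plain
gain parameter. It is NOT the VST API: the ControlEvent struct and
process_gain() are hypothetical names of mine, and i assume the events are
sorted by frame and all lie inside the current buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical timestamped control change (not taken from any real API). */
typedef struct {
    int   frame;  /* offset into the current buffer, 0 <= frame < numframes */
    float value;  /* parameter value from that frame on */
} ControlEvent;

static float gain = 1.0f;  /* current value, kept across process calls */

/* The buffer is processed in one go; parameter changes are applied at
   their frame offsets instead of the host splitting the buffer. */
void process_gain(const float *in, float *out, int numframes,
                  const ControlEvent *events, size_t nevents) {
    assert(numframes >= 0);
    size_t e = 0;
    for (int i = 0; i < numframes; i++) {
        /* Apply every event that is due at or before this frame. */
        while (e < nevents && events[e].frame <= i) {
            gain = events[e].value;
            e++;
        }
        out[i] = in[i] * gain;
    }
}
```

The host can keep calling with its full, constant buffer size, and the
plugin does the extra bookkeeping of walking the event list.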
Personally i like this approach better than the
LADSPA-not-guaranteed-buffer-size approach (but i am biased).
> Florian, since we would like to add convolution to LinuxSampler over
> time it would be cool if you could add the above ideas to libconvolve
> so that one can use the lib without worrying about supplying the right
> buffer sizes etc, and in plugin host environments it would be handy too
> since we don't always know what the host will do.
Actually how to solve this problem is application specific:
Are cpu spikes preferable to context switches (which a threading
solution, which would even out the load, would require, plus some extra
latency)?
I mean i could add the above (nonthreading) mechanism and provide an extra
function call for it, but i'd rather not, since it hides a fundamental
aspect of partitioned convolution which every user should be aware of
and for which a different solution might be better suited to the
application at hand.
Plus i personally don't like the cpu spikey non threading solution at
all for exactly the reasons you mentioned ;)
Flo
--
Palimm Palimm!
http://affenbande.org/~tapas/