First, I
don't understand why you want to design a "synth API". If
you want to play a note, why not instantiate a DSP network that
does the job, connect it to the main network (where system audio
outs reside), run it for a while and then destroy it? That is what
events are in my system - timed modifications to the DSP network.
99% of the synths people use these days are hardcoded, highly
optimized monoliths that are easy to use and relatively easy to host.
We'd like to support that kind of stuff on Linux as well, preferably
with an API that works equally well for effects, mixers and even
basic modular synthesis.
Besides, real time instantiation is something that most of us want to
avoid at nearly any cost. It is a *very* complex thing to get right
(i.e. RT-safe) in any but the simplest designs.
Okay, I realize that now; maybe your approach is better. RT and really
good latency were not, and are not, the first priority in MONKEY - it is
more intended for composition, so I can afford to instantiate units
dynamically. But it's good that someone is concerned about RT.
However, if
you want, you can define functions like C x =
exp((x - 9/12) * log(2)) * middleA, where middleA is another
function that takes no parameters. Then you can give pitch as "C 4"
(i.e. C in octave 4), for instance. The expression is evaluated, and
when the event (= a modification to the DSP network) is instantiated it
becomes an input to it: constant if it is constant, linearly
interpolated at a specified rate otherwise. I should explain more
about MONKEY for this to make much sense, but maybe later.
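Roughly, in C for illustration (not MONKEY's actual syntax or API; the
names and the 440 Hz value for middle A are my own assumptions, and the
formula is transcribed literally from above):

    /* Illustration only - not MONKEY's actual syntax or API. */
    #include <math.h>

    static double middleA(void)        /* a function with no parameters */
    {
        return 440.0;                  /* assumed value for middle A    */
    }

    /* "C x" = exp((x - 9/12) * log(2)) * middleA,
     * i.e. 2^(x - 9/12) times middle A. */
    static double C(double x)
    {
        return exp((x - 9.0 / 12.0) * log(2.0)) * middleA();
    }

    /* When the event is instantiated, the expression is evaluated and
     * the (here constant) result becomes an input to the new unit. */
    static double pitch_for_event(void)
    {
        return C(4.0);                 /* "C 4" */
    }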
This sounds interesting and very flexible - but what's the cost? How
many voices of "real" sounds can you play at once on your average PC?
(Say, a 2 GHz P4 or something.) Is it possible to start a sound with
sample accurate timing? How many voices would this average PC cope
with starting at the exact same time?
Well, in MONKEY I have done away with separate audio and control signals -
there is only one type of signal. However, each block of a signal may
consist of an arbitrary number of consecutive subblocks. There are three
types of subblocks: constant, linear and data. A (say) LADSPA control
signal block is equivalent to a MONKEY signal block that has one subblock
which is constant and covers the whole block. Then there's the linear
subblock type, which specifies a value at the beginning and a per-sample
delta value. The data subblock type is just audio rate data.
The native API then provides for conversion between different types of
blocks for units that want, say, flat audio data. This is actually less
expensive and complex than it sounds.
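As a sketch of what this could look like in C (the type and function
names here are invented for illustration, not the native API's real
ones):

    /* A block is a sequence of consecutive subblocks, each constant,
     * linear or raw audio data. */
    typedef enum { SUB_CONSTANT, SUB_LINEAR, SUB_DATA } subblock_type;

    typedef struct {
        subblock_type type;
        unsigned      length;  /* samples covered by this subblock      */
        float         value;   /* SUB_CONSTANT: the value,
                                  SUB_LINEAR: value at the first sample */
        float         delta;   /* SUB_LINEAR: per-sample increment      */
        const float  *data;    /* SUB_DATA: audio-rate samples          */
    } subblock;

    typedef struct {
        unsigned        n_subblocks;
        const subblock *subblocks;
    } signal_block;

    /* Expand a block into flat audio-rate data, for units that want it. */
    static void flatten(const signal_block *b, float *out)
    {
        for (unsigned i = 0; i < b->n_subblocks; ++i) {
            const subblock *s = &b->subblocks[i];
            for (unsigned j = 0; j < s->length; ++j) {
                switch (s->type) {
                case SUB_CONSTANT: *out++ = s->value;                     break;
                case SUB_LINEAR:   *out++ = s->value + s->delta * (float)j; break;
                case SUB_DATA:     *out++ = s->data[j];                   break;
                }
            }
        }
    }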
About the cost: an expression for pitch would be evaluated, say, 100 times
a second, and values in between would be linearly interpolated, so that
overhead is negligible. It probably does not matter that e.g. pitch glides
are not exactly logarithmic; a piecewise approximation should suffice in
most cases.
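For example, a glide evaluated at 100 Hz and rendered with per-sample
deltas costs one multiply-add per sample instead of an exp() per sample;
something along these lines (the glide() expression and the rates are
only illustrative):

    #include <math.h>

    #define SAMPLE_RATE   44100.0
    #define CONTROL_RATE    100.0
    #define STEP ((unsigned)(SAMPLE_RATE / CONTROL_RATE))  /* 441 samples */

    static double glide(double t)      /* some pitch expression of time  */
    {
        return 220.0 * pow(2.0, t);    /* one octave per second          */
    }

    /* Evaluate the expression at two control points and interpolate
     * linearly for the samples in between. */
    static void render_segment(double t0, float *out)
    {
        double v0    = glide(t0);
        double v1    = glide(t0 + 1.0 / CONTROL_RATE);
        double delta = (v1 - v0) / STEP;

        for (unsigned i = 0; i < STEP; ++i)
            out[i] = (float)(v0 + delta * i);
    }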
I'm not sure about the overhead of the whole system but I believe the
instantiation overhead to be small, even if you play 100 notes a second.
However, I haven't measured instantiation times, and there certainly is
some overhead. We are still talking about standard block-based processing,
though. Yes, sample-accurate timing is implemented: when a plugin is run
it is given start and end sample offsets.
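To illustrate what I mean (invented names again, not the real
interface): the unit's run callback only fills the part of the block
between the offsets, so an event can start or stop it mid-block.

    #include <math.h>

    typedef struct {
        double phase;          /* current oscillator phase (radians) */
        double freq;           /* frequency in Hz                    */
        double sample_rate;
    } sine_unit;

    /* Fill only samples [start, end) of the block; the host passes the
     * offsets so a note can begin or end with sample accuracy. */
    static void sine_run(sine_unit *u, float *out,
                         unsigned start, unsigned end)
    {
        for (unsigned i = start; i < end; ++i) {
            out[i] = (float)sin(u->phase);
            u->phase += 2.0 * 3.14159265358979 * u->freq / u->sample_rate;
        }
    }

    /* A note starting 17 samples into a 256-sample block:
     *     sine_run(u, out, 17, 256);   first block
     *     sine_run(u, out,  0, 256);   subsequent blocks */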
Hmm, that might have sounded confusing, but I intend to write a full
account of MONKEY's architecture in the near future.
You could think of our API as...
It seems to be a solid design so far. I will definitely comment on it when
you have a first draft for a proposal.
--
Sami Perttu "Flower chase the sunshine"
Sami.Perttu(a)hiit.fi
http://www.cs.helsinki.fi/u/perttu