[linux-audio-dev] Plugin APIs (again)

David Olofson david at olofson.net
Sun Dec 8 17:12:01 UTC 2002


On Sunday 08 December 2002 06.00, Tim Hockin wrote:
[...]
> Aside: I propose the following for naming stuff:
>
> * Plugin: a chunk of code, loaded or not, that implements this API
>   (a .so file or a running instance).
> * Instrument: an instance of a plugin that supports the instrument
>   API bits (analogous to a chunk of hardware from Roland).
> * Channel: a 'part', 'channel' or 'instrument' from MIDI. 
>   Instruments have 1 or more Channels.
> * Control: a knob, button, slider, or virtual control within a
>   Channel or Instrument.  Controls can be master (e.g. master
>   volume), per-Channel (e.g. channel pressure) or per-Voice (e.g.
>   aftertouch).
> * Preset: a stored or loaded set of values for all Controls in a
>   Channel.
> * Voice: a playing sound within a Channel.  Channels may have
>   multiple voices.
> * Port: an input/output connection on a Plugin.  Audio data goes
>   over ports.

I totally agree with this. I'd just like to add that, depending on 
the API design, Ports might come in four kinds total:
	* Audio Input Ports
	* Audio Output Ports
	* Event Input Ports
	* Event Output Ports
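
In C terms, that's just an enum (hypothetical, of course):

	typedef enum
	{
		PORT_AUDIO_IN,
		PORT_AUDIO_OUT,
		PORT_EVENT_IN,
		PORT_EVENT_OUT
	} PORT_KIND;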


> > I don't think this has much to do with the API - and I don't see
> > the possibility of a clash. If you take any two controls, they
> > may have any relation whatsoever, *defined by the synth*, and the
> > host should
>
> I agree with that IF we say you have to have a separate control
> struct for per-voice and per-part controls.  If we define a control
> and say it has a flag which identifies it as MASTER, PART, VOICE,
> then we do need to suggest a relationship.  I like defining a
> control once, but it's becoming more obvious to NOT do that.  The
> range of valid values may be wildly different.

Of course. MASTER, PART and VOICE controls are completely separate 
things when it comes to numbers, valid ranges, internal handling and 
all that. You could say that's the "relationship" between them.

As to "control struct", I'm not sure what you mean. The way Audiality 
is designed, there is only one struct - the event struct - which is 
used for all communication through the event system. Inside that 
"protocol", where only the linked list and timestamp stuff is 
strictly defined, various parts of the engine define their own, 
higher level protocols, such as:

	* Low level voices (ie "oscillators") with these events:
		VE_START	Start playing waveform <arg1>.
		VE_STOP		Stop and free voice.
		VE_SET		Set control <index> to <arg1>.
		VE_ISET		Set interpolated control <index> to <arg1>.
		VE_IRAMP	Ramp interpolated control <index> to <arg1>
				over <arg2> frames.

	  VE_SET has one set of controls...
		VC_WAVE		Waveform index (latched on loop)
		VC_LOOP		looping on/off
		VC_PITCH	linear frequency
		VC_RETRIG	retrig after N samples
		VC_RANDTRIG	random mod. to RETRIG
		VC_PRIM_BUS	target bus for ACC_VOLUME
		VC_SEND_BUS	target bus for ACC_SEND
		VC_RESAMPLE	Resampling mode

	  while VE_ISET and VE_IRAMP have their own set:
		VIC_LVOL	Left volume
		VIC_RVOL	Right volume
		VIC_LSEND	Left send
		VIC_RSEND	Right send

There's no conflict and no extra conditionals; just two sets of 
controls, each with their own event(s). (BTW, note that VC_WAVE and 
VIC_LVOL share the same actual enum value - but that's irrelevant, as 
they're never used as arguments to the same event type.)

Now, this is hardwired stuff, but it's no different with a "real" 
API. Just have one set of events for each class of controls. 
Obviously, that applies to the "negotiation" API as well; you need 
separate calls, or an extra argument to tell the control classes 
apart.
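
In C, that could boil down to something like this (just a sketch; 
all names are made up):

	/* One event type per control class... */
	typedef enum
	{
		EV_MASTER_CTL,	/* set Master control <index> */
		EV_CHANNEL_CTL,	/* set Channel control <index> */
		EV_VOICE_CTL	/* set Voice control <index> of <vvid> */
	} EV_TYPE;

	/* ...and an extra argument on the negotiation side: */
	typedef enum { CC_MASTER, CC_CHANNEL, CC_VOICE } CTL_CLASS;
	int get_ctl_info(PLUGIN *p, CTL_CLASS cc, int index,
			CTL_INFO *info);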


> Do we want to provide a suggested range of values for MIDI
> compatible controls?  For example, Controls that are labelled
> CONTROL_AFTERTOUCH should expect values between 0 and 1.0, and the
> host can map MIDI onto that?

Something like that. I'd like to use an enum for "control kind" 
(corresponding to those "standard" MIDI CCs), and some hints to tell 
the host about range, linearity, absolute limits, sensible 
default/"no effect" value and that sort of thing.

Or we could use strings, but I *definitely* prefer enums, since those 
can be checked at compile time. You can still have two-way binary 
compatibility. Just have a default case "I don't know what this is" 
in the host.
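
Something along these lines, perhaps (hypothetical names; the hint 
flags would be in the spirit of LADSPA's):

	typedef enum
	{
		CK_UNKNOWN = 0,	/* host's default case lands here */
		CK_VOLUME,
		CK_PAN,
		CK_PITCH,
		CK_AFTERTOUCH
		/* ... */
	} CTL_KIND;

	typedef struct
	{
		CTL_KIND	kind;
		int		hints;		/* CH_LOGARITHMIC,
						   CH_BOUNDED_*... */
		float		min, max;	/* iff CH_BOUNDED_* */
		float		def;		/* "no effect" value */
	} CTL_INFO;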


> Or 0 and 127, like MIDI?

No, no MIDIisms on that level, please. It's easy enough to throw a 
suitable factor into the int->float conversion.


> or 0 and
> MAX_INT and let the synth blow up when you pass a VELOCITY of
> 10000, and it expects 127?

Just like with LADSPA, the plugin will have to tell the host whether 
or not it has absolute upper and/or lower limits. In VST, the range 
is [0,1], period, but I think that's too restrictive. Scaling that 
range to whatever you need is fine, but that's not the point: Some 
controls just don't *have* sensible absolute limits - so why force 
plugin designers to "invent" some?


> > You're thinking entirely in keyboardist terms here. :-) You're
> > forgetting that some instruments really don't have anything like
> > attack and release in real life, but rather "bow speed", "air
> > speed" and that kind of stuff.
>
> umm, speed == velocity, no?

Yes, that's what I've been thinking every now and then, all along. 
However, it seems that the term "velocity" is generally thought of as 
"instantaneous velocity at the time of impact". MIDI brainwashing 
again; that's why I tend to think of "velocity" as a parameter rather 
than anything that could be a continuous control...

Another reason would be that if you accelerate a drum stick up to a 
certain velocity, it'll bounce at impact - and that doesn't map all 
that well to one-way communication and "don't rely on short roundtrip 
latency!", does it? Force feedback MIDI controllers? :-)

Normally, plugins just do what you tell them. They don't change their 
own input controls. (Well, they *could*, but that would be pretty 
hairy, I think... Or maybe not. Anyone dare to try it? ;-)


> But I get your point, and I concede
> this point. Velocity is an event.

Continuous control...? ;-)


> > I think so, but I'm not 100% determined. (Audiality is still
> > mostly a monolith, and I'm not completely sure how to best split
> > it up into real plugins.)
> >
> > [Preventing] plugins from being multipart doesn't make much
> > sense. Just
> > reserve one control/event port for the whole instrument/plugin,
> > and then allow the instrument/plugin to have as many other
> > control/event ports as it likes - just like audio in and out
> > ports.
>
> I had been assuming a single event-queue per instance.

So have I (when designing MAIA) - until I actually implemented 
Audiality. There *should* be a single per-instance event port - the 
Master Event Port - but you may optionally have one event port for 
each channel as well, if you have any use for per-channel events. You 
may well just use the Master Event Port for everything in a 
multichannel plugin, but then you're on your own when it comes to 
addressing of channels and stuff. (That would require more dimensions 
of addressing in each event, which means a bigger event struct, more 
decoding overhead etc.)


> Two
> questions:  why would a plugin (assuming not multi-timbral for now)
> have more than one event-queue?

It wouldn't have to. If there's only one Channel, Master and Channel 
events are effectively equivalent - so you won't need to tell them 
apart, right? Consequently, you can just have a single Master Event 
Port, and register your full range of events on that.

You *could* have a Channel Event Port for your single channel (if you 
even have something like that internally), but there's not much 
point, unless you really have two loops in your plugin, and want 
events for them delivered separately.


> Assuming we do support
> multi-timbral synths, is there an advantage to having
> per-channel event-queues?

Yes, definitely. As it is in Audiality (which made me realize this), 
you could view each Channel as a plugin instance - and they do indeed 
have one event port (which implies one queue) per instance. Each 
Channel has a typical "MAIA style" loop:

	while(frames_left)
	{
		/* Handle all events due at the current frame */
		while(next_event && next_event->timestamp == now)
			next_event = process_and_remove(next_event);

		/* Render audio up to the next event, or to the
		   end of the buffer if no more events are due */
		if(next_event)
			frames = next_event->timestamp - now;
		else
			frames = frames_left;
		process_audio(frames);
		now += frames;
		frames_left -= frames;
	}

It just sits there for one buffer at a time, worrying only about its 
*own* events and its own audio, and that's it.

Now, if you would merge multiple Channels into one plugin instance, 
it would not make much sense to squeeze all events through the same 
port. You would only have to split them up internally in the plugin, 
or you would get a very inefficient loop structure that effectively 
splits buffers for the whole synth for every single event.


> Do you envision each Channel getting the
> ->run() method separately, or does ->run() loop? If it is looping
> anyway, is there an advantage to multiple event-queues?

run() (or process(), as I prefer to call it) should be called only 
once per buffer, and there should be only one per instance. That's 
the one major point with supporting multichannel at all; the *plugin* 
gets to decide how and when things are done for the whole set of 
Channels. If processing all channels in parallel is better in some 
places, fine; do so.

Besides, multiple run()/process() calls would mean more overhead than 
a single call + internal looping. You can probably disregard the 
actual function call overhead in most cases, but plugins may have a 
whole lot of work to do in each call before they can actually enter 
the loop.

Another thing to keep in mind is that a multichannel plugin may have 
a highly optimized master mixing/routing section hooked up to the 
Master Event Port. Where would you run that if you have one 
run()/process() call for each Channel? You would effectively end up 
with at the very least N+1 plugins for N Channels.

If the Master section is also involved before, or in the middle of 
the signal flow of Channels, you'll soon have countless little 
plugins, communicating through Audio and Event Ports, instead of one, 
potentially highly optimized "monolith" plugin.


> > > > ports. As of now, there is one event port for each Channel,
> > > > and one event port for each physical Voice. "Virtual Voices"
> > > > (ie Voices as
>
> I know you've done some exploring - do we really want one
> event-queue per voice + one for each channel + one for master?  or
> is one per instance simpler?

No, not one per *voice*. Voices have way too short a lifetime, so it 
just wouldn't make sense to try and track them with actual objects on 
the API level. They should be considered Abstract Objects that are 
only addressed through a field in Voice events that are sent to the 
Channel Event Port.

(Note that I actually do have one event port for each voice in 
Audiality - but that's just because it happens to be the simplest and 
most efficient way of controlling them. Implementation stuff.)
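
To illustrate "addressed through a field" - a sketch only, not the 
actual Audiality event struct:

	typedef struct VOICE_EVENT
	{
		struct VOICE_EVENT	*next;
		unsigned	timestamp;
		int		type;		/* VOICE_ON, ... */
		int		vvid;		/* which Voice */
		int		arg1, arg2;
	} VOICE_EVENT;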


[...]
> > > I'm still not clear on this.  What plugin would trigger another
> > > plugin?
> >
> > Any plugin that processes events rather than (or in addition to)
> > audio. They're usually called "MIDI plugins".
>
> I can see controller plugins (It's another extension to this API,
> perhaps)

No. Let's not repeat Steinberg's mistake here as well.


> which deal with sending events to other plugins.  And I
> guess an arpeggiator plugin could be a controller..

Yes. It could also listen to an audio track, extracting pitch 
information from it.


> > > If so, how do you reconcile that they
> > > will each have a pool of VVIDs - I suppose they can get VVIDs
> > > from the host, but we're adding a fair bit of complexity now.
> >
> > Where? They have to get the VVIDs from somewhere anyway. It's
> > probably not a great idea to just assume that plugins can accept
> > any integer number as a valid VVID, since that complicates/slows
> > down voice lookup.
>
> you don't send just ANY integer, you get the integer to id a voice
> FROM the plug.

You get a *range* of valid VVIDs from the host (which has to ask the 
plugin at some point) by asking for it and saying how many VVIDs you 
need. Plugins that want to reallocate internal tables to deal with 
the VVIDs will generally have to do this outside the RT thread, but 
plugins that use the tag/search approach (like Audiality as of now) 
don't even have to be asked; the host could maintain the VVIDs all by 
itself.


> But if VOICE_ON is an event, it gets much uglier. 
> Do let me see if I get this right:
>
> Host allocates the VVID pool (perhaps a bitmask or some other way
> of indicating usage)
> Host wants to send a VOICE_ON:
> 	allocates an unused VVID
> 	sends VOICE_ON with that VVID
> 	when it sends a VOICE_OFF or receives a VOICE_OFF it can mark that
> 	   VVID as unused again.

It seems like this would be a host global matter... What would be the 
point in that? All you need is a way to address Voices within *one 
Channel* - you don't have to worry about other Plugins, or even other 
Channels on the same Plugin. (For table based implementations, 
plugins have one table per Channel, and for tag/search 
implementations, you just tag Voices with VVIDs *and* Channel 
indices.)
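
A tag/search implementation in that spirit might look like this 
(minimal sketch; hypothetical names, linear search for clarity):

	VOICE *find_voice(VOICE *v, int nvoices, int channel, int vvid)
	{
		int i;
		for(i = 0; i < nvoices; ++i)
			if((v[i].channel == channel) &&
					(v[i].vvid == vvid))
				return &v[i];
		return NULL;	/* Voice gone; ignore the event */
	}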


> Plugin wants to send a VOICE_ON:
> 	calls host->get_vvids(32)
> 	host allocates it 32 VVIDs

You do this when the host tells you to connect to an event port, and 
what you do is actually:

	first_id = host->get_vvids(target_port, number_of_VVIDs);
	if(first_id < 0)
		give_up;

Note that the first_id thing is actually *only* needed when the host 
connects multiple output ports to the same input port. Normally, it 
will be 0.
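
For example, if two senders are connected to the same input port, 
they might get (using the get_vvids() call above):

	base_a = host->get_vvids(port, 16);	/* returns 0 */
	base_b = host->get_vvids(port, 16);	/* returns 16 */

Each sender then uses base + 0..15 as its VVIDs.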


> 	sends VOICE_ON with any of those, and must manage them itself
> 	(alternatively, management is simpler if it just allocates one
> VVID at a time and frees it when it is done)

Definitely not. That would mean at least two function calls for every 
VOICE_ON event... *heh* (Meanwhile, Audiality event allocation, 
sending and handling macros will only ever make a function call when 
the Event Pool is empty - and that's normally a critical error. Each 
common event operation is only a matter of a few CPU instructions.)
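
For reference, pool based event allocation in that style could look 
roughly like this (a sketch only - not the actual Audiality macros; 
EVENT would be a linked list struct like the ones above):

	EVENT *event_pool;	/* free list; refilled outside RT */

	#define	EVENT_ALLOC(ev)					\
		{						\
			(ev) = event_pool;			\
			if(ev)					\
				event_pool = (ev)->next;	\
			else					\
				event_pool_empty();	/* !!! */ \
		}

No malloc() and no locking in the normal case; just a few pointer 
operations.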


> Or does the HOST ask the INSTRUMENT for VVIDs?  If so, why? 
> Explain? :)

The host will at some point ask the plugin, or rather *tell* the 
plugin how many VVIDs are needed for each Event Input port it wants 
to connect. Plugins may say

	"I don't have Voices!"
		(No need for VVIDs...)

or

	"Anything that fits makes a fine VVID for me."
		(Full 32 bit range, no actual allocation needed.)

or

	"I need internal tables. How many VVIDs do you need?"
		(Host says N, and plugin allocates tables for N
		VVIDs, addressed as 0..N-1.)

The last case would be the most interesting one, since it requires 
that plugins allocate memory whenever the host needs more VVIDs. That 
is, if plugins do this themselves using malloc() or something, you 
simply cannot connect to Event Ports without taking the plugin out of 
the RT engine - *unless* you decide on a number of VVIDs to allocate 
for each Channel of every plugin, right when they're instantiated.

A third solution would be to have the *host* allocate memory for the 
tables needed, and just pass it to the plugin for initialization. Or 
even better (so you don't have to run the table init code in RT 
context), just have a Channel flag that tells the host that calling 
plugin->allocate_vvids() is not RT safe, but *is* thread safe. The 
host can safely call plugin->allocate_vvids() for a Channel that is 
not in use, while the plugin is running in the RT thread.
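
In host code, that flag might be used something like this 
(hypothetical names again):

	if(plugin->channel_flags[ch] & CHF_VVID_ALLOC_THREADSAFE)
	{
		/* Thread safe, but not RT safe: call from a worker
		   thread while the RT thread keeps running the
		   plugin's other Channels. */
		plugin->allocate_vvids(plugin, ch, n_vvids);
	}
	else
	{
		/* Have to take the Channel's plugin out of the RT
		   engine first, or preallocate at instantiation. */
		defer_vvid_alloc(plugin, ch, n_vvids);
	}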


[...]
> > > host has to handle synchronization issues (one recv event-queue
> > > per thread, or it has to be plugin -> host callback, or
> > > something), but that's OK.
> >
> > Well, if your threads are not in sample sync, you're in
> > trouble...
>
> I meant threading sync issues.  Host can't just pass one
> event-queue to all plugins in a multi-thread environment.  But
> again, that is the HOSTs problem.

Yes. If the event pool is part of the host struct (which I don't even 
have in Audiality - yet! *heh*), you're safe. All you have to do is 
put proxy ports in between when connecting plugins across the 
engine threads, so you never touch ports that might be read by 
someone else at the same time.

Getting that to work without events arriving late is less trivial, 
though! :-) (When you have more than one thread, plugin execution 
order is no longer guaranteed, obviously - unless you enforce it by 
having blocking or spinning synchronization at critical points in the 
net or something.)


> So the host can pass the address
> of its event-queue during activate(), or the plugin can make a
> host->send_event() callback.

Well, the host's event queue is the smallest problem. Getting plugins 
to run properly in more than one thread is the big deal - but if that 
works, sending reply events to the host is trivial. Just put the 
reply event port in the host struct as well, and have one host struct 
for each 
engine thread.
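
That is, something like this (sketch; names invented):

	typedef struct HOST
	{
		EVENT	*event_pool;	/* private to this thread */
		PORT	reply_port;	/* plugin->host events; only
					   read by this thread */
	} HOST;

	HOST	host[NUM_ENGINE_THREADS];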


//David Olofson - Programmer, Composer, Open Source Advocate

.- The Return of Audiality! --------------------------------.
| Free/Open Source Audio Engine for use in Games or Studio. |
| RT and off-line synth. Scripting. Sample accurate timing. |
`---------------------------> http://olofson.net/audiality -'
.- M A I A -------------------------------------------------.
|    The Multimedia Application Integration Architecture    |
`----------------------------> http://www.linuxdj.com/maia -'
   --- http://olofson.net --- http://www.reologica.se ---


