[linux-audio-dev] PTAF link and comments

David Olofson david at olofson.net
Tue Feb 4 19:14:01 UTC 2003


On Tuesday 04 February 2003 20.56, Tim Hockin wrote:
> > Hello everybody,
> >
> > I'm one of the PTAF authors and I've just subscribed to
> > this list, seeing the standard is discussed here. I will
> > try to answer your questions and to collect feedback.
> > I apologize for this long mail.
>
> WELCOME!  I'm excited to see one of the more well-known plugin
> companies chiming in.  I'm also excited that XAP and PTAF are not
> too far from each other.  Hopefully we can reconcile our efforts
> and just have one API, and not two very similar ones.

Indeed.

BTW, inspired by a private discussion with Ron Kuper of Cakewalk, I'd 
just like to remind people that XAP itself is really secondary to our 
ultimate goal. We want a nice plugin API, and nice audio software for 
Linux, right?

What I'm saying is basically that it's not a goal in itself to create 
our own "standard" here. If we can get along with the GMPI group and 
something useful gets out of it, things will be better for everyone. 
What this would mean to XAP as such is uncertain, of course - but I 
think we will have *very* little influence on what will become The 
Audio Plugin API, unless we participate in the GMPI effort.

Now, I'm not dropping out of XAP regardless. If nothing else, a 
working, documented implementation is better than a bunch of design 
documents on more or less untested ideas.


[...]
> > Lastly, plug-in properties may depend on contextual data,
> > like OS or host versions, date, CPU, environment variable,
> > etc. - things sometimes useful for copy protections.
>
> hmm, I'm not sure I buy this - how does this matter?

A copy protected plugin may choose to run in "demo mode" if it thinks 
something is wrong. Then it might have additional restrictions or 
something...

Anyway, I don't really see what this has to do with the API. You can 
(and probably should!) implement that kind of stuff without relying 
on the host in any way. The host could be cracked, or it could in fact 
be a custom cracking and/or reverse engineering tool! Can't trust 
anyone or anything.


> > word "Sequencer" is badly chosen, someone already told me
> > that too. If anyone has a better idea for its name...
>
> "host"?

Maybe - but still, what's the point? Unless you're supposed to deal 
with calls from multiple threads, the host thread (or rather, the 
audio thread; the one that the actual XAP_host struct belongs to) is 
the only one you'll ever be called from.


> > I'm personally not familiar with Linux GUI toolkits (I'm
> > confused with Gnome, KDE, X, Berlin, etc, sorry for
> > my ignorance), is there a problem with them ? In this
> > case wouldn't be possible to launch the GUI module from
> > the plug-in ? What would you recommend ?
>
> Defining a proper cross-platform GUI system will be fun.  I see
> four levels of UI from plugins:
>
> 1) none - host can autogenerate from hints

This is pretty much a requirement anyway, I would think. VST has 
explicit support for it, and LADSPA provides all the info you need to 
do it. Likewise with XAP.

This part is mostly done, right? (Well, we don't have a real host yet, 
so we obviously don't have an implementation.)


> 2) layout - plugin provides XML or something suggesting its UI,
> host draws it

Basically like 1), but with some extra info, right?


> 3) graphics+layout - plugin provides XML or something as well as
> graphics - host is responsible for animating according to plugin
> spec

Cool, but hard to get right, I think. At some point, someone will yell 
"scripting!" - and then it's Tcl/Tk all over again, although with 
more chrome... Maybe it could be cool, but it's a BIG thing to hack. 
It would basically have to solve 99.99% of all GUI needs to be 
justified.

Well, unless we can just whip something almost that good up in no 
time, from existing, highly portable and suitably licensed stuff. 
Maybe if libvstgui is released under the LGPL, and we hack a binding 
for some nice and easy to learn language...?


> 4) total - plugin provides binary UI code (possibly bytecode
> or lib calls)

Byte code? Pretty close to what I suggested for 3. And the same 
issues, of course. The native code version is probably simpler, but 
obviously, not nearly as flexible.

Makes me think of the next stage of Audiality scripting, BTW... And 
that will have to be RT safe bytecode, though that doesn't really 
matter in this context. (Anyone know about some nice LGPL VM I could 
hack? :-)


> > One solution is to make the host define implicit priority
> > levels for these clients, for example 1 = automation,
> > 2 = remote controls, 3 = main GUI. This is good enough
> > for parameter changes arriving simultaneously, but is
> > not appropriate for "interleaved" changes.
>
> Actually, I prefer this situation to appear broken - you should NOT
> have interleaved control sources.  Tokens need to be forcefully
> grabbed with that model.  If I see a knob that is turning via some
> old automation, I should be able to grab it in the UI and take
> over.  It is not clear to me, however, whether the host should
> remember that and leave control with the highest prio controller,
> or just make it temporary.  This needs more thought.  It may not be
> the domain of the API at all, but the host.

It has to be supported by the API, or you can't tell whether a GUI is 
"freewheeling", or if the user is just holding a knob still, 
expecting the automation to record that.

Thus the active/passive feature for control outputs. Any suggestions 
for how to implement it? Can't think of anything nice and sexy... 
Could use an extra event, but then you have twice as many control 
related events, all of a sudden; *two* of them! ;-)

Actually, no. We'll need one RAMP or SET event for each control 
datatype. (Either or, depending on whether the datatype can be ramped 
or not. It's obviously silly to call the integer and string events 
"RAMP", if the duration field will always be ignored and assumed to 
be 0. I think...) Anyway, point being that you need only one ACTIVE 
event. It would work for all control datatypes.

Oh well. Better ideas, anyone?


[...]
> > > * I think the VST style dispatcher idea is silly
> >
> > But it's the easiest way to :
> >
> > 1) Ensure the bidirectional compatibility across API versions
>
> I agree that it is simplest for host developers.

I wouldn't even agree on that... It may *look* simpler, but it 
basically means you have to check every function return, and take 
alternative actions *after* trying your favourite approach. You can 
cache your findings if you like, but that doesn't exactly *simplify* 
anything.

Why not check the version code and all calls a plugin of that version 
may have, and then pick your wrapper, configure your "plugin 
manager", or what have you?

If you want it *real* simple, you might even have a "quick hack host 
SDK" that fills in unimplemented plugin calls with suitable fakes or 
emulation calls, to make your dead simple, no conditionals host 
happy. (That's basically what I'm doing with all those process() 
variants in Audiality. Once a plugin is initialized, all of them are 
guaranteed to do what you want, one way or another.)
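
A rough sketch of that "check once, then fill in the gaps" approach, in C. 
Everything named DEMO_* or demo_* is made up for illustration; it is not 
XAP, and the version numbers and the reset() call are assumptions. The 
point is only that the version check and the fallbacks live in one place, 
so the rest of the host never tests for missing calls.

```c
#include <stddef.h>

/* Hypothetical host-side call table; not an actual XAP structure. */
typedef struct DEMO_plugin
{
	int version;	/* API version reported by the plugin */
	int (*process)(struct DEMO_plugin *p, float *buf, unsigned frames);
	int (*reset)(struct DEMO_plugin *p);	/* assumed new in version 2 */
} DEMO_plugin;

/* Stand-in for plugins that have no reset() call. */
int demo_fake_reset(DEMO_plugin *p)
{
	(void)p;
	return 0;	/* nothing to reset; report success */
}

/* Trivial example plugin callback: halve the signal in place. */
int demo_halve_process(DEMO_plugin *p, float *buf, unsigned frames)
{
	unsigned i;
	(void)p;
	for(i = 0; i < frames; ++i)
		buf[i] *= 0.5f;
	return 0;
}

/*
 * Run once after loading. Checks the version, then fills in
 * whatever is missing, so the rest of the host can call every
 * entry without conditionals. Returns 0 on success, -1 if the
 * plugin is unusable.
 */
int demo_prepare(DEMO_plugin *p)
{
	if(!p || !p->process)
		return -1;	/* process() cannot be faked */
	if(p->version < 2 || !p->reset)
		p->reset = demo_fake_reset;
	return 0;
}
```

After demo_prepare() succeeds, the host can call p->reset(p) on any 
plugin, old or new, without looking at the version again.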


> Do we want to
> make life simpler for host developers or plugin developers.  I've
> advocated staying simple in the past on this subject.

And I still agree. There won't be as many hosts as plugins, and we can 
hack a simplified host SDK, if people think our requirements on the 
host are too much. (Not that I find that very likely, considering 
what host authors on Windows and Mac OS have to deal with.)


> I could be convinced that a slimmed down system that was somewhat
> in the middle of our two proposals is best.  I don't like using
> variable arguments in an API, however.
>
> plug->run_func(enum cmd func, ...);
>
> That scares me as a source for big bad runtime bugs, and just
> passing a void pointer doesn't naturally handle all cases.

Agreed. Type checking in C is poor enough as it is, without asking for 
more trouble.


[...]
> > > * UTF-8, rather than ASCII or UNICODE.
> >
> > Actually it IS Unicode, packed in UTF-8. This is the
> > best of both worlds, because strings restricted to the
>
> Actually, I don't see much point in anything complicated. 
> Localization should be separated from the plugin.

How can it be, if you want it to work for patch names and the like? In 
fact, most platforms even support it in file names these days, so 
you're not really getting away from it, no matter what.


> I much prefer
> something like gettext().  The plugin provides keys (generally an
> english string) and the key is looked up for translations.  It
> requires very simple processing by the host, and frees the plugin
> from worrying about it.

Right, but that solves only part of the problem. You have to be able 
to deal with "weird characters" in plain strings as well, including 
paths and file names.


> UTF8 is fine because it tastes like ASCII
> to me, so I don't have to deal with it.  I don't know if it is
> useful to spec UTF8

It's probably useful to any host that wants to display stuff 
correctly. That won't work unless you can safely assume that all 
strings meant for display are UTF8.

As to plugins, this is exactly the point: Pretend it's plain ASCII and 
don't worry about it.
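
To illustrate the split, here is a small C sketch (the function name is 
made up). A plugin can copy and compare UTF-8 strings with the normal 
byte-oriented string functions; only code that cares about what ends up on 
screen, typically the host, has to look at the encoding. The counter below 
assumes its input is valid UTF-8 and does no validation.

```c
#include <stddef.h>
#include <string.h>

/*
 * Count code points in a NUL-terminated, valid UTF-8 string by
 * skipping continuation bytes (bit pattern 10xxxxxx). strlen()
 * on the same string returns the number of bytes instead.
 */
size_t demo_utf8_codepoints(const char *s)
{
	size_t n = 0;
	for(; *s; ++s)
		if(((unsigned char)*s & 0xC0) != 0x80)
			++n;
	return n;
}
```

For plain ASCII the two counts are the same, which is why a plugin that 
only ever passes strings around can ignore the difference.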


> > > * Hosts assume all plugins to be in-place broken. Why?
> > > * No mix output mode; only replace. More overhead...
> >
> > 1) Copy/mix/etc overhead on plug-in pins is *very* light
> > on today's GHz computers. Think of computers in 5, 10
>
> you'll get a lot of arguments about that on this list - linux
> people tend to have a 486 or pentium stashed somewhere. :)

Hmmm... My last Pentium mobo gave up a few weeks ago. My slowest 
machine right now is probably that 300 MHz Geode GX1 SBC.

Anyway, the fastest machine I have here is this P-III 933 - and with 
PC133 SDRAM, I can only say that hitting memory hurts, and you have 
to do quite some work per sample to avoid being memory bound.

And I'm probably overestimating the new 333 and 400 MHz DDR memories 
used in P-4 systems, having done little more than playing RTCW with a 
FireGL 8800 card on such a machine. (*That* runs fast, at least. :-)

That is, if anyone's in doubt, I'm back at my old position; we should 
at least have an adding version.

We could do like VST and make the replacing version optional... Better 
than making the adding version optional, since faking the replacing 
version doesn't involve any extra buffers.
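
A sketch of what "faking the replacing version" could look like on the 
host side, in C. The callback signature and all demo_* names are 
assumptions for illustration, not XAP or VST. It clears the output and 
then lets the adding version mix into it, so no scratch buffer is needed. 
Going the other way (faking adding on top of replacing) would need a 
temporary buffer to render into before mixing.

```c
#include <string.h>

/* Hypothetical "adding" callback: mixes its result into out[]. */
typedef void (*DEMO_process_fn)(void *plugin, const float *in,
		float *out, unsigned frames);

/* Example adding callback: out[i] += 0.5 * in[i] */
void demo_gain_adding(void *plugin, const float *in,
		float *out, unsigned frames)
{
	unsigned i;
	(void)plugin;
	for(i = 0; i < frames; ++i)
		out[i] += 0.5f * in[i];
}

/*
 * Emulate a replacing call using only the adding callback.
 * NOTE: in and out must not be the same buffer, since clearing
 * out[] would destroy the input. Also assumes all-zero bytes
 * represent 0.0f, which holds for IEEE 754 floats.
 */
void demo_fake_replacing(DEMO_process_fn adding, void *plugin,
		const float *in, float *out, unsigned frames)
{
	memset(out, 0, frames * sizeof(float));
	adding(plugin, in, out, frames);
}
```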


> > 2) This is the most important reason: programmer failure.
>
> This I might buy.  I'm really unsure.

Well, copy/paste is dangerous - but OTOH, you can pull something like 
what I do with the voice resamplers in Audiality: Including the same 
file a number of times with different #defines as "arguments". I have 
40 (!) different versions, but only really code for 5...
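
A single-file variation of the same trick, as a sketch: instead of 
including one file several times, a macro stamps out the function once per 
output mode. This is not how Audiality does it, and all names are made up; 
it only shows the idea of writing the DSP loop once and generating the 
variants.

```c
/* The two output stages, as "arguments" to the generator below. */
#define DEMO_OUT_REPLACE(dst, v)	((dst) = (v))
#define DEMO_OUT_MIX(dst, v)		((dst) += (v))

/* Generate one gain function per output stage. */
#define DEMO_DEFINE_GAIN(name, OUTPUT)				\
void name(const float *in, float *out,				\
		unsigned frames, float gain)			\
{								\
	unsigned i;						\
	for(i = 0; i < frames; ++i)				\
		OUTPUT(out[i], in[i] * gain);			\
}

DEMO_DEFINE_GAIN(demo_gain_replacing, DEMO_OUT_REPLACE)
DEMO_DEFINE_GAIN(demo_gain_mixing, DEMO_OUT_MIX)
```

The repeated-#include approach scales better when the shared body is long, 
since the code stays readable and debuggable as normal C rather than as 
one big macro.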

Also, if you split up your plugin in inline functions, the actual 
callback functions become really rather small, making copy/paste 
maintenance much easier. You should be able to fit several versions in 
one screen that way. Obvious splits are to turn event processing and 
audio processing into two inline funcs, since they're on different 
"loop levels". After that, there's just the ingress code of the 
callback, basically, and you can deal with that by having a third 
inline function filling in "local variables" in a local struct and 
passing it by reference to the inline functions. Finally, the 
obvious: The output stage would be the fourth inline, of which you 
have two versions. You should get something like this:

	int process(XAP_plugin *plugin, ...)
	{
		MY_locals locals;
		init_stuff(&locals, plugin, ...);
		while(process_some_events(&locals))
		{
			int i;
			for(i = 0; i < locals.fragment; ++i)
			{
				process_audio(&locals);
				output_audio_mixing(&locals);
			}
		}
		return locals.result;
	}

and perhaps

	int process_replacing(XAP_plugin *plugin, ...)
	{
		MY_locals locals;
		init_stuff(&locals, plugin, ...);
		while(process_some_events(&locals))
		{
			int i;
			for(i = 0; i < locals.fragment; ++i)
			{
				process_audio(&locals);
				output_audio_replacing(&locals);
			}
		}
		return locals.result;
	}


[...]
> > 3) Why no in-place processing ? Same reasons as above,
[...]
> I'm leaning more towards the idea that if a plugin can do in-place
> processing, it may be asked to, but it should have the ability to
> specify for itself.

Yeah. Just a hint that can be ignored, if "in-place broken" is 
assumed.


> > bugs when coding even simple programs. Final user will
> > attach value to reliable software, for sure.
>
> Agreed - but they also attach value to efficient software.

Yes...


[...]
> > I think about replacing all these redundant functions
> > by just one, callable only in Initialized state. It
> > would be kinda similar to the audio processing opcode,
> > but would only receive events, which wouldn't be dated.
>
> We've just spoken of calling the process() function with a duration
> of 0 samples.  Events are then processed immediately, and no second
> API is needed.

Good point. I almost forgot that you can use that feature for this as 
well...

It might be important to know that this changes the rules for 
process*() calls slightly. You actually may get process*() calls even 
when you're not supposed to process audio. The rule essentially 
becomes "You will not be asked to process audio unless you're in the 
active state."
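
A sketch of a process() that follows this rule, in C. The event and plugin 
structs are made up for illustration and are much simpler than anything 
XAP would use; events are assumed to be sorted by time. With frames == 0 
the loop applies the events due "now" and returns without touching the 
buffer, which may then be NULL.

```c
#include <stddef.h>

/* Hypothetical timestamped event: set the gain at frame 'when'. */
typedef struct DEMO_event
{
	unsigned when;	/* frame offset into the current block */
	float value;
} DEMO_event;

typedef struct DEMO_plugin
{
	int active;	/* nonzero when audio processing is allowed */
	float gain;
} DEMO_plugin;

/* Returns 0 on success, -1 if asked for audio while inactive. */
int demo_process(DEMO_plugin *p, const DEMO_event *ev, size_t n_events,
		float *buf, unsigned frames)
{
	size_t e = 0;
	unsigned pos = 0;
	if(frames && !p->active)
		return -1;	/* audio only in the active state */
	for(;;)
	{
		unsigned end, i;
		/* Apply every event due at the current position. */
		while(e < n_events && ev[e].when <= pos)
			p->gain = ev[e++].value;
		if(pos >= frames)
			break;	/* frames == 0 ends up here at once */
		/* Render up to the next event or the end of the block. */
		end = frames;
		if(e < n_events && ev[e].when < frames)
			end = ev[e].when;
		for(i = pos; i < end; ++i)
			buf[i] *= p->gain;
		pos = end;
	}
	return 0;
}
```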


> > > * Ramping API seems awkward...
> >
> > What would you propose ? I designed it to let plug-in
> > developers bypass the ramp correctly. Because there are
> > chances that ramps would be completely ignored by many
> > of them. Indeed it is often preferable to smooth the
> > transitions according to internal plug-in rules.
>
> Unlike some people here (I haven't gotten into it yet with them :)
> I think ramping is an add-on to discrete values.  I think all
> controls should support discrete SET operations, and RAMP
> operations are an optional extension that the host can use IFF the
> control says it supports it.

Problem with that is that then you *have* to make connections through 
converter plugins, or through the host, unless both ends agree. That 
is, we get two physically incompatible control types; ramped controls 
and discrete controls.

If you think of it this way: SET gets an extra argument 'duration' 
that you would "normally" set to 0. *If* you use another value, the 
event effectively becomes a RAMP event.

Or as I've previously expressed it; define RAMP with a zero duration 
to be a SET operation. Slight special case, but if you actually try 
to implement it, you realise you need it anyway to get ramping to 
work properly without special-casing ramps that get rounded down to 0 
duration.

Now, whether we should call that single event RAMP or SET is open. I 
would suggest CONTROL instead, as that's less specific and thus, less 
misleading. Besides, it has the advantage of appearing to be related 
to controls - which is very true. :-)
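
A sketch of the receiving end of such a single event, in C, with made-up 
names. One handler takes a target and a duration; a duration of 0 is 
simply a SET. Note that this sketch chooses to stop at the target when the 
ramp is done, which is only one of the behaviours discussed in this 
thread.

```c
/* Hypothetical per-control state for linear ramping. */
typedef struct DEMO_control
{
	float value;	/* current value */
	float delta;	/* per-frame increment while ramping */
	float target;	/* value to reach at the end of the ramp */
	unsigned left;	/* frames left in the current ramp */
} DEMO_control;

/* Handle a "CONTROL" event; duration is in frames, 0 means SET. */
void demo_control_event(DEMO_control *c, float target, unsigned duration)
{
	c->target = target;
	c->left = duration;
	if(duration == 0)
	{
		c->value = target;	/* plain SET; cancels any ramp */
		c->delta = 0.0f;
	}
	else
		c->delta = (target - c->value) / (float)duration;
}

/* Return the value for this frame, then advance by one frame. */
float demo_control_tick(DEMO_control *c)
{
	float v = c->value;
	if(c->left)
	{
		if(--c->left == 0)
			c->value = c->target;	/* land exactly on target */
		else
			c->value += c->delta;
	}
	return v;
}
```

A sender that rounds a very short ramp down to 0 frames needs no special 
case here; the event just turns into a SET.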


> Internally, all SET operations should
> be ramped, even if it is a fast linear ramp.

Yes. DSP implementation thing.


> > > * It is not specified whether ramping stops automatically
> > > at the end value, although one would assume it should,
> > > considering the design of the interface.
> >
> > Hmmm... I start to see your point. You mean that it
> > is impossible for the plug-in to know if the block ramp
> > is part of a bigger ramp, extending over several blocks ?
>
> Right and wrong.  We've agreed that a RAMP operation is a target
> and duration, but that does not imply a stop.  A control must reach
> the target at (or slightly before) the end-point.  Beyond that, it
> is undefined.  It may keep ramping, or stop, or whatever.  The host
> should send a SET event (which cancels any pending ramps) to stop a
> ramp.  Now it may instead be useful to just say that ramping WILL
> continue, and get rid of the undefined BS.

Yeah. It's just that it really *is* undefined, if you consider 
practical implementations.

For example, if you approximate the ramping of filter coefficients 
with 2nd or 3rd degree polynomials, you can get a reasonably linear 
ramp of the filter characteristics. But what happens when you run 
outside the range for which you calculated the coefficients...?


[...]
> > > * Why [0, 2] ranges for Velocity and Pressure?
[...]
> ick - if there is a balanced (or even unbalanced) spectrum of
> values, center it at 0, please :)

That's another idea - [-1, 1] makes as much sense as [0, 1] in many 
cases.

In short, I'm not very enthusiastic about this. I like the LADSPA 
approach better.


[...]
> > > * Why have normalized parameter values at all?
> >
> > Normalized parameter value is intended to show something
> > like the potentiometer course, in order to have significant
>
> I rather like normalized values.  I've been contemplating this
> problem, myself.  I don't like having to call normalize functions
> for every translation, though.  Can the translation overhead be
> minimized?

Consider this:

	* Controls that are meant to be connected generally
	  have the same units and the same ranges.

	* If they *do* have different ranges, you end up in
	  an interesting situation: What's the correct way
	  to deal with it? Nothing says stretching the ranges
	  to map 1:1 is the right thing to do, and in many
	  cases, it's definitely wrong.

So, you can't really avoid conversions in between plugins anyway, if 
you start connecting things that aren't explicitly made to fit. If 
you don't, you don't have to do any conversions, whether you have 
normalized parameters or not.
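
A small C illustration of the second point, with made-up names and ranges. 
Say a control output delivers Hz in [20, 20000] and is connected to an 
input that accepts Hz in [100, 10000]. Stretching the ranges 1:1 silently 
changes the meaning of every value, while clamping keeps the unit intact. 
Which one is "right" depends on the controls, and that is exactly why the 
API can't decide it for you.

```c
/* Stretch v linearly from [smin, smax] to [dmin, dmax]. */
float demo_stretch(float v, float smin, float smax,
		float dmin, float dmax)
{
	return dmin + (v - smin) / (smax - smin) * (dmax - dmin);
}

/* Keep the unit; just limit v to what the input accepts. */
float demo_clamp(float v, float dmin, float dmax)
{
	if(v < dmin)
		return dmin;
	if(v > dmax)
		return dmax;
	return v;
}
```

With these ranges, 440 Hz stays 440 Hz when clamped, but comes out as 
roughly 308 Hz when stretched.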


> > > * The "save state chunk" call seems cool, but what's
> > > the point, really?
> >
> > This is not mandatory at all, there is a lot of plug-in
> > which can live without. It is intended to store the
> > plug-in instance state when you save the host document.
>
> I actually have this in my own notes for XAP - an optional way for
> the plugin to save 'extra stuff' with a preset or document.

How about combining a few "old" features here:

	* String and raw data controls.

	* An output must initialize the connected input
	  as soon as audio processing continues.

	* Calling process() for zero frames is legal,
	  and means "process only events for the
	  current frame".


Now, give one or more raw data control outputs a suitable type hint, 
such as "INSTANCE_STATE", so hosts can just connect them, call 
process() for 0 frames and store the data somewhere. No extra calls 
or anything needed, and it "accidentally" works exactly like querying 
GUI plugins for their current state. (You'd do that in cases where 
you don't have an automation sequencer in between the GUI and DSP 
plugin and still want to store states, or when you don't want to have 
the sequencer analyze all traffic from GUI plugins.)
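
Roughly, and with every name below invented for the purpose of the sketch 
(this is not XAP, and a real plugin would send a raw data event rather 
than write into a struct), the host side could look like this in C. The 
fixed 64-byte blob and the single state output per plugin are 
simplifications.

```c
#include <stddef.h>
#include <string.h>

enum { DEMO_HINT_NONE, DEMO_HINT_INSTANCE_STATE };

/* Hypothetical destination for a raw data control output. */
typedef struct DEMO_blob
{
	unsigned char data[64];
	size_t size;
} DEMO_blob;

typedef struct DEMO_plugin
{
	int state_hint;		/* type hint of the raw data output */
	DEMO_blob *state_out;	/* where it is connected, or NULL */
	int needs_init;		/* set on connect; cleared when sent */
	float cutoff;		/* the "state" of this toy plugin */
} DEMO_plugin;

void demo_connect_state(DEMO_plugin *p, DEMO_blob *dest)
{
	p->state_out = dest;
	p->needs_init = 1;	/* output must initialize its input */
}

/* Event-only part of process(); audio rendering is omitted. */
void demo_process(DEMO_plugin *p, unsigned frames)
{
	(void)frames;
	if(p->state_out && p->needs_init)
	{
		memcpy(p->state_out->data, &p->cutoff, sizeof(p->cutoff));
		p->state_out->size = sizeof(p->cutoff);
		p->needs_init = 0;
	}
}

/* Host: connect, run for 0 frames, and the state is in dest. */
int demo_save_state(DEMO_plugin *p, DEMO_blob *dest)
{
	if(p->state_hint != DEMO_HINT_INSTANCE_STATE)
		return -1;
	demo_connect_state(p, dest);
	demo_process(p, 0);
	return 0;
}
```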


//David Olofson - Programmer, Composer, Open Source Advocate

.- The Return of Audiality! --------------------------------.
| Free/Open Source Audio Engine for use in Games or Studio. |
| RT and off-line synth. Scripting. Sample accurate timing. |
`---------------------------> http://olofson.net/audiality -'
   --- http://olofson.net --- http://www.reologica.se ---
