Well, a guaranteed unique ID is really rather handy
when you want to
load up a project on another system and be *sure* that you're using
the right plugins... That's about the only strong motivation I can
think of right now, but it's strong enough for me.
Ok, I see your motivation for this. I hate the idea of 'centrally assigned'
anythings for something as open as this. I'll think more on it...
IMHO, plugins should not worry about whether or not
their outputs are
connected. (In fact, there are reasons why you'd want to always
guarantee that all ports are connected before you let a plugin run.
Open inputs would be connected to a dummy silent output, and open
outputs would be connected to a /dev/null equivalent.)
I disagree with that - it's a waste of DSP cycles, processing audio that goes
nowhere.
as some kind of interface to the plugin, but to me, it
seems way too
limited to be of any real use. You still need a serious interface
If it has no other use than 'ignore this signal and spare the CPU time', that
is good enough for me.
just like you program a studio sampler to output stuff
to the outputs
you want. This interface may be standardized or not - or there may be
both variants - but either way, it has to be more sophisticated than
one bit per output.
Ehh, again, I think it is simpler. Let's assume a simple sampler. It has a
single output with 0 or more channels (in my terminology). If you load a
stereo sample, it has 2 channels. A 5.1 sample has 6 channels. Let's
consider an 8-pad drum machine. It has 8 outputs each with 0-2 channels.
Load a stereo sample, that output has 2 channels. Now, as I said, maybe
this is a bad idea. Maybe it should be assumed that all outputs have 2
channels and mono gets duplicated to both or (simpler) LEFT is MONO.
What gets confusing is what we're really debating. If I want to do a simple
stereo-only host, can I just connect the first pair of outs and the plugin
will route automatically? Or do I need to connect all 8 to the same buffer
in order to get all the output? In the process of writing this I have
convinced myself you are right :) If the host does not connect pad #2, pad
#2 is silent.
I think there should be as little policy as possible
in an API. As
in; if a plugin can assume that all ins and outs will be connected,
there are no special cases to worry about, and thus, no need for a
policy.
Slight change - a plugin only needs to handle connected inouts. If an inout
is not connected, the plugin can skip it or do whatever it likes.
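To make that concrete, something like this is what I have in mind for the
plugin side (metacode; the struct and field names are placeholders, not
settled API):

struct example_plugin {
    int n_out_ports;
    float **out_bufs;   /* host leaves NULL for unconnected outputs */
};

/* Only render into outputs the host actually connected. */
void example_run(struct example_plugin *p, unsigned long nframes)
{
    int i;
    unsigned long s;
    for (i = 0; i < p->n_out_ports; i++) {
        float *buf = p->out_bufs[i];
        if (!buf)
            continue;               /* unconnected: skip, no DSP spent */
        for (s = 0; s < nframes; s++)
            buf[s] = 0.0f;          /* real DSP would go here */
    }
}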
Strongest reason *not* to use multichannel ports:
They don't mix
well with how you work in a studio. If something gives you multiple
I considered that. At some point I made a conscious decision to trade off
that ability for the simplicity of knowing that all my stereo channels are
bonded together. I guess I am rethinking that.
Strongest reason *for*: When implementing it as
interleaved data, it
bleh - I always assumed an inout was n mono channels. The only reason for
grouping them into inouts was to 'bond' them.
Like on a studio mixing desk; little notes saying
things like
"bdrum", "snare upper", "snare lower", "overhead
left", "overhead
right" etc.
Should I use a different word than 'port'? Is it too overloaded with
LADSPA?
Hrrm, so how does something like this sound?
(metacode)
struct port_desc {
    char *names;
};

simple sampler descriptor {
    ...
    int n_out_ports = 6;
    struct port_desc *out_ports[] = {
        { "mono:left" },
        { "right" },
        { "rear:center" },
        { "rear-left" },
        { "rear-right" },
        { "sub:lfe" }
    };
    ...
};
So the host would know that if it connects 1 output, the name is "mono", and
if it connects 2 outputs, the names are "left" and "right", etc. Then it can
connect "left" to "left" on the next plugin automatically. And if you want
to hook it up to a mono output, the user could be asked, or assumptions can
be made. This has the advantage(?) of not specifying a range of acceptable
configs, but a list. It can have 1, 2, or 6 channels.
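For what it's worth, here is one way the host could pick a label out of that
colon-separated list - pure sketch, and the mapping from "how many outputs
are connected" to "which alternative to use" is still an open question:

#include <string.h>

/* Return the nth ':'-separated alternative from a names string like
 * "mono:left" into out[].  Returns -1 if there is no such alternative. */
static int port_alt_name(const char *names, int alt, char *out, size_t len)
{
    const char *p = names;
    const char *end;
    size_t n;

    while (alt-- > 0) {
        p = strchr(p, ':');
        if (!p)
            return -1;
        p++;
    }
    end = strchr(p, ':');
    n = end ? (size_t)(end - p) : strlen(p);
    if (n >= len)
        n = len - 1;
    memcpy(out, p, n);
    out[n] = '\0';
    return 0;
}

/* port_alt_name("mono:left", 0, buf, sizeof buf) -> "mono"
 * port_alt_name("mono:left", 1, buf, sizeof buf) -> "left" */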
another example:
drum machine descriptor {
    ...
    int n_out_ports = 16;
    struct port_desc *out_ports[] = {
        { "left:left(0):mono(0)" }, { "right:right(0)" },
        { "left(1):mono(1)" },      { "right(1)" },
        { "left(2):mono(2)" },      { "right(2)" },
        { "left(3):mono(3)" },      { "right(3)" },
        { "left(4):mono(4)" },      { "right(4)" },
        { "left(5):mono(5)" },      { "right(5)" },
        { "left(6):mono(6)" },      { "right(6)" },
        { "left(7):mono(7)" },      { "right(7)" },
    };
    ...
};
and finally:
mixer descriptor {
    ...
    int n_in_ports = -1;
    struct port_desc *in_ports[] = {
        { "in(%d)" }
    };
    int n_out_ports = 2;
    struct port_desc *out_ports[] = {
        { "left:mono" },
        { "right" }
    };
};
Or something similar. It seems that this basic code would be duplicated in
almost every plugin. Can we make assumptions and let the plugin leave it
blank if the assumptions are correct?
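One way "leave it blank" could work on the host side (again metacode - the
descriptor fields are hypothetical, just enough for the sketch):

struct plugin_desc {            /* hypothetical; only the fields this */
    int n_out_ports;            /* sketch needs                       */
    struct port_desc *out_ports;
};

/* Default: if a plugin declares no output ports at all, the host
 * assumes a plain stereo pair. */
static struct port_desc default_stereo_out[] = {
    { "left:mono" },
    { "right" }
};

struct port_desc *host_out_ports(struct plugin_desc *d, int *n)
{
    if (d->n_out_ports == 0 || d->out_ports == NULL) {
        *n = 2;
        return default_stereo_out;
    }
    *n = d->n_out_ports;
    return d->out_ports;
}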
In thinking about this I realized a potential problem with not having bonded
channels. A mixer strip is now a mono strip. It seems really nice to be
able to say "Input 0 is 2-channels" and load a stereo mixer slot, "Input 1
is 1-channel" and load a mono mixer slot, "Input 2 is 6-channel" and load a
5.1 mixer slot.
I'm back to being in a quandary. Someone convince me!
Point being that if the host understands the labels,
it can figure
out what belongs together and thus may bundle mono ports together
into "multichannel cables" on the user interface level.
This is what the "inout is a bundle of mono channels" idea does.
Well, I don't quite understand the voice_ison()
call. I think voice
allocation is best handled internally by each synth, as it's highly
implementation dependent.
My ideas wrt polyphony:
* note_on returns an int voice-id
* that voice-id is used by the host for note_off() or note_ctrl()
* you can limit polyphony in the host
- when I trigger the 3rd voice on an instrument set for 2-voices, I
can note_off() one of them
* you can limit polyphony in the instrument
- host has triggered a 3rd voice, but I only support 2, so I
internally note_off() one of them and return that voice_id again.
The host can recognise that and account for polyphony accurately
(even if it is nothing more than a counter).
* note_off identifies to the host if a voice has already ended (e.g. a sample)
* note_ison can be called by the host periodically for each voice to see
if it is still alive (think of step-sequenced samples). If a sample ends,
the host would want to decrement its voice counter. The other option is a
callback to the host. Not sure which is less ugly.
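In call form, the above would look roughly like this (names and signatures
are just my current guesses, not a settled API):

typedef int oapi_voice_id;

struct oapi_instrument {
    /* Start a note; returns a voice-id the host uses from then on.  If
     * the synth steals a voice to stay within its own polyphony limit,
     * it can return an id the host already knows, so the host's voice
     * counter stays accurate. */
    oapi_voice_id (*note_on)(void *plugin, int note, int velocity);

    /* Release a voice.  The return value can tell the host the voice
     * had already ended on its own (e.g. a one-shot sample). */
    int (*note_off)(void *plugin, oapi_voice_id voice);

    /* Per-voice control change. */
    void (*note_ctrl)(void *plugin, oapi_voice_id voice, int ctrl, float value);

    /* Poll whether a voice is still sounding, so the host can decrement
     * its voice counter; a callback into the host is the alternative. */
    int (*note_ison)(void *plugin, oapi_voice_id voice);
};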
I am NOT trying to account for cross-app or cross-lan voices, though a JACK
instrument which reads from a JACK port would be neat.
Side note: is there a mechanism in JACK for me to pass (for example) a
'START NOW' message or a 'GO BACK 1000 samples' message to a JACK port?
a real problem so far, but I don't like it. I
want *everything*
sample accurate! ;-)
Actually, our focus is slightly different. I'm FAR less concerned with
sample-accurate control. Small enough buffers make tick-accurate control
viable in my mind. But I could be convinced. It sure is SIMPLER. :)
quite
convenient for things like strings and pads. FL does
Velocity, Pan, Filter Cut, Filter Res, and Pitch Bend. Not sure
which of those I want to support, but I like the idea.
"None of those, but instead, anything" would be my suggestion. I
think it's a bad idea to "hardcode" a small number of controls into
the API. Some kind of loose "standard" such as the MIDI CC allocation,
could be handy, but the point is that control ports should just be
control ports; their function is supposed to be decided by the plugin
author.
I've contemplated an array of params that are configurable per-note. Not
everything is. What if we had something like
struct int_voice_param {
    int id;
    char *name;
    int low;
    int high;
};
and specify an array of them. The host can use this array to build a list
of per-note params to display to the user. This starts to get messy with
type-specific controls. Perhaps this info belongs as part of the control
structure. Yes, I think so.
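Whichever struct this info ends up living in, the array itself might look
something like this (values made up for illustration):

static struct int_voice_param example_voice_params[] = {
    { 0, "velocity",      0,   127 },
    { 1, "pan",         -64,    63 },
    { 2, "pitchbend", -8192,  8191 },
};
#define N_VOICE_PARAMS \
    (sizeof(example_voice_params) / sizeof(example_voice_params[0]))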
Should be handled on the UI level, IMHO. (See above.)
Doing it down
here only complicates the connection management for no real gain.
I want to ignore as much of it as possible in the UI. I want to keep it
simple at the highest level so a musician spends his time making music, not
dragging virtual wires. Ideally if there is a stereo instrument and I want
to add a stereo reverb, I'd just drop it in place, all connections made
automatically. If I have a mono instrument and I want a stereo reverb, I'd
drop the reverb in place and it would automatically insert a mono-stereo
panner plugin between them.
Yeah... But this is one subject where I think
you'll have to search
for a long time to find even two audio hackers that agree on the same
set of data types. ;-)
I think INT, FLOAT, and STRING suffice pretty well. And I MAY be convinced
that INT is not needed. Really, I prefer int (maps well to MIDI). What
kinds of knobs need to be floats?
Just a note here: Most real instruments don't have
an absolute start
or end of each note. For example, a violin has its pitch defined as
soon as you put your finger on the string - but when is the note-on,
and *what* is it? I would say "bow speed" would be much more
appropriate than on/off events.
I'd assume a violin modeller would have a BOWSPEED control. The note_on()
would tell it what the eventual pitch would be. The plugin would use
BOWSPEED to model the attack.
Well, yes. There *has* to be a set of basic types that
cover
"anything we can think of". (Very small set; probably just float and
raw data blocks.) I'm thinking that one might be able to have some
"conveniency types" implemented on top of the others, rather than a
larger number of actual types.
I agree - Bool is a flag on INT. File is a flag on String.
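In struct form that could be as simple as (all names are placeholders):

enum ctrl_type { CTRL_INT, CTRL_FLOAT, CTRL_STRING };

#define CTRL_HINT_BOOL  (1 << 0)   /* INT restricted to 0/1        */
#define CTRL_HINT_FILE  (1 << 1)   /* STRING that names a file     */

struct ctrl_desc {
    enum ctrl_type type;
    unsigned int   hints;          /* CTRL_HINT_* flags            */
    char          *name;
};

/* Example: a bypass switch and a sample-file selector. */
static struct ctrl_desc example_ctrls[] = {
    { CTRL_INT,    CTRL_HINT_BOOL, "bypass" },
    { CTRL_STRING, CTRL_HINT_FILE, "sample" },
};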
Dunno if this makes a lot of sense - I just have a
feeling that
keeping the number of different objects in a system to a functional
minimum is generally a good idea. What the "functional minimum" is
here remains to be seen...
With this I agree. One of the reasons I HATE so many APIs is that they are
grossly over normalized. I don't need a pad_factory object and a pad object
and a plugin_factory object and a parameter object and an
automatable_parameter object and a scope object... I want there to be as
FEW structs/objects as possible.
That said, one I am considering adding is a struct oapi_host. This would
have callbacks for things like malloc, free, and mem_failure (the HOST
should decide how to handle memory allocation failures, not the plugin) as
well as higher level stuff like get_buffer, free_buffer, and who knows what
else. Minimal, but it puts control for error handling back in the hands of
the host.
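Roughly (a first stab; the signatures are guesses):

#include <stddef.h>

struct oapi_host {
    void *(*malloc)(struct oapi_host *h, size_t size);
    void  (*free)(struct oapi_host *h, void *ptr);
    void  (*mem_failure)(struct oapi_host *h);     /* host decides policy */

    float *(*get_buffer)(struct oapi_host *h, unsigned long nframes);
    void   (*free_buffer)(struct oapi_host *h, float *buf);

    void *host_data;   /* opaque pointer back to the host's own context */
};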
Yeah, I know. It's just that I get nervous when
something tries to do
"everything", but leaves out the "custom format" fallback for cases
that cannot be foreseen. :-)
We're speaking of controls here. In my mind controls have three
characteristics. 1) They have to specify enough information that the host can
draw a nice UI automatically. 2) They are automatable (whether it is sane
or not is different!). 3) They alone compose a preset. What would a
raw_data_block be?
Well, you can put stuff in external files, but that
seems a bit risky
to me, in some situations. Hosts should provide per-project space for
files that should always go with the project, and some rock solid way
of ensuring that
I don't really want the plugins writing files. I'd rather see the host
write a preset file by reading all the control information, or by the host
calling a new char *oapi_serialize() method to store and a new
oapi_deserialize(char *data) method to load.
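i.e. something like this (sketch only - ownership of the returned string,
encoding, etc. all TBD; host_store_preset() stands in for whatever storage
the host uses):

/* Plugin side: */
char *oapi_serialize(void *plugin);                  /* opaque blob out */
int   oapi_deserialize(void *plugin, const char *data);

/* Host side, saving a preset: */
void host_store_preset(const char *name, const char *data);

void host_save_preset(void *plug, const char *name)
{
    char *blob = oapi_serialize(plug);
    host_store_preset(name, blob);
}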
should be used in any way in the API. (Although people
on the VST
list seem to believe MIDI is great as a control protocol for plugins,
it's never going into another plugin API, if I can help it... What a
:) good good
So do I. I'm just trying to base my examples on
well known equipment
and terminology.
Honestly, I don't know all the terminology. I have never worked with much
studio gear. Most of what I have done is in the software space. So I may
be making mistakes by that, but I may also be tossing obsolete-but-accepted
notions for the same reason :)
[ state-machine header ... ]
Very interesting. I actually like it very much. I am going to have a think
on that. It may be a better paradigm.
Well, it's basically about sending structured data
around, with
timestamps telling the receiver when to process the data. As an
example, instead of calling
voice_start(v, ev->arg1);
directly, at exactly the right time (which would mean you have to
split buffers for sample accurate timing), I do this:
aev_send1(&v->port, 0, VE_START, wave);
where aev_send1() is an inline function that grabs an event struct
from the pool, fills it in and sends it to the voice's event port.
The sender does nothing more about it for now; it just keeps
processing its entire buffer and then returns. Just as if it had
been processing only audio buffers. In fact, the timestamped events
are very similar to audio data in that they contain both actual data
and timing information - it's just that the timing info is explicit
in events.
Interesting. How important is this REALLY, though? Let me break it into
two parts: note control and parameter control. Note control can be tick
accurate as far as I am concerned :) As for param control, it seems to me
that a host that will automate params will PROBABLY have small ticks. If
the ticks are small (10-50 samples), is there a REAL drawback to
tick-accurate control? I know that philosophically there is, but REALLY.
In the event model, if I want a smooth ramp for a control between 0 and 100
across 10 ticks of 10 samples, do I need to send 10 'control += 1' events
before each tick?
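For comparison, the two obvious shapes that could take in the event model
(metacode - send_event() stands in for the real event call, and VE_CONTROL /
VE_RAMP are invented event types):

/* (a) Brute force: one timestamped event per step of the ramp
 *     (here, +1 per sample over 100 samples). */
for (t = 0; t < 100; t++)
    send_event(port, t /* timestamp */, VE_CONTROL, ctrl, (float)t);

/* (b) A single ramp event: "reach 100 over the next 100 samples";
 *     the receiving plugin interpolates per sample. */
send_event(port, 0, VE_RAMP, ctrl, 100.0f /* target */, 100 /* duration */);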
Seriously, it's probably time to move on to the
VSTi/DXi level now.
LADSPA and JACK rule, but the integration is still "only" on the
audio processing/routing level. We can't build a complete, seriously
useful virtual studio, until the execution and control of synths is
as rock solid as the audio.
Well, I really want to do it, so let's go. You keep talking about
Audiality, but if we're designing the same thing, why aren't we working on
the same project?
Lots of ideas to noodle on and lose sleep over. Looking forward to more
discussion.
Thanks
Tim