[linux-audio-dev] more on XAP Virtual Voice ID system

David Olofson david at olofson.net
Thu Jan 9 01:20:01 UTC 2003


[Replying to some posts by Tim Hockin.]

On Wednesday 08 January 2003 23.55, Tim Hockin wrote:
[...]
> a voice has to be started.  Maybe not.  Maybe the synth can report
> that it has N voices on at all times.  Hmm.  Still, VOICE_ALLOC is
> akin to note_on.

Well, if a voice can be "started" and still actually be both silent 
and physically deactivated (that is, acting as or being a dumb 
control tracker) - then yes.

(A sensible API shouldn't make this illegal, and I can't really see 
how it could prevent it.)


> > Also, keep in mind that any feedback of this kind requires a real
> > connection in the reverse direction. This makes the API and hosts
> > more complex - and I still can't see any benefit, really.
>
> We already have a rudimentary reverse path.

Where? (The VVID entries? No, those are "private" to synths!)


[...]
> > > The only problem is that it requires dialog.
> >
> > That's a rather serious problem, OTOH...
>
> And that is what I am not convinced of - I'm not against VVIDs, I
> just want to play it out..

Well, unless I'm forgetting something, it's basically "just" the 
latency issue, which seems to make controlling remote synths 
impractical, if possible at all. (I think it needs further API logic 
to work. See below.)

Apart from that, I just think it's ugly having both hosts and senders 
mess with "keys" that really belong in the synth internals. Even 
having hosts provide VVID entries (which they never access) is 
stretching it, but it's the cleanest way of avoiding "voice 
searching" proposed so far, and it's a synth<->local host thing only.
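To make that concrete, here's a minimal sketch of the host-provided 
VVID table idea as I picture it: the host owns the storage, but never 
reads or writes the entries; only the synth does. All the names here 
(xap_host, vvid_table, synth_bind_voice and so on) are made up for 
illustration, not actual XAP API:

```c
#include <assert.h>

#define XAP_MAX_VVIDS 256
#define NO_VOICE (-1)

/* Host side: owns the storage, never touches the entries. */
typedef struct {
	int vvid_table[XAP_MAX_VVIDS];	/* synth-private contents */
} xap_host;

static void host_init_vvids(xap_host *h)
{
	for (int i = 0; i < XAP_MAX_VVIDS; i++)
		h->vvid_table[i] = NO_VOICE;
}

/* Synth side: maps a VVID to one of its own voice indices. */
static void synth_bind_voice(xap_host *h, int vvid, int voice)
{
	h->vvid_table[vvid] = voice;	/* only the synth writes this */
}

static int synth_lookup_voice(const xap_host *h, int vvid)
{
	return h->vvid_table[vvid];	/* only the synth reads this */
}
```

The point being: senders and hosts never inspect the entries, so 
there's no "reverse path" here; it's just scratch memory the host 
lends to the synth.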


BTW, we've already agreed on doing this for control addressing. (The 
cookies that replaced control indices.) The big difference is that 
that's not a sample accurate real time protocol.


> > > * no carving of a VVID namespace for controller plugins
> >
> > No, plugins have to do that in real time instead.
>
> No, plugins have a fixed namespace - they dole out real VIDs to
> whomever asks for them.

Well, yes. It's just that VIDs become illegal as voices are stolen, 
so synths have to remember which voices should not accept direct 
addressing. (Those should respond only to the temporary negative 
virtual VIDs, until the host starts using the real VID - whenever it 
finds out about that. That's where >1 block latency becomes seriously 
troublesome - the *real* latency issue.)

I think that's really rather counter-intuitive. Isn't part of the 
point with handing out a VID that you generate yourself that you 
shouldn't have to check incoming VIDs all the time?

Further, I don't like the idea of forcing senders to virtualize VIDs 
internally, just to be able to use *both* VIDs they invent themselves 
(the negative ones) and "real" VIDs returned from the synths.


>  Are VVIDs global or per-plugin?  Sorry, I
> forget what we had decided on that..

For practical reasons, I think they're best handled as per-host. That 
is, they're "global" as long as you're only talking to local synths, 
but you really should think of them as *per-plugin* when you want to 
talk to remote synths. (That way, you can connect to remote synths 
transparently WRT VVIDs.)

So, officially:
	VVIDs are per plugin.

For host authors:
	VVIDs *can* be managed globally for the local host.


> > >    /* find a vvid that is not currently playing */
> > >    do {
> > >         this_vvid = vvid_next++;
> > >    } while (vvid_is_active(this_vvid));
> >
> > Again, this is voice allocation. Leave this to the synth.
>
> You have a pool of VVIDs.  Some of them are long-lasting.  Some are
> short lasting.  You always want to find the LRU VVID.  If there are
> available VVIDs, take the LRU free one.  If there are not, you
> either need to voice-steal at the host or alloc more VVIDs.  Right?

Well, what *actually* made me comment on that, is that I thought 
"vvid_is_active()" had something to do with whether or not the 
*synth* is using the VVID.

Was that the idea? If so, again: VVID entries are not for feedback; 
they simply do not exist to senders. They're a host provided VVID 
mapping service for synths; nothing else.

So, as to "allocating" VVIDs, it's nothing more than a matter of 
keeping "note contexts" apart. In many cases, you don't even need a 
VVID manager for that. All you have to guarantee is that there is one 
VVID for each note you want to control at any time.
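In the simplest case, "keeping note contexts apart" is nothing but a 
wrapping counter over a pool that's at least as large as the maximum 
number of notes the sender controls at once. A sketch (the names are 
hypothetical, not API):

```c
#include <assert.h>

#define MY_VVID_POOL 32	/* >= max simultaneously controlled notes */

static int next_vvid = 0;

/* Grab a fresh VVID for a new note context. As long as the pool is
 * big enough, a previously used VVID is guaranteed to be "done" by
 * the time the counter wraps around to it again. */
static int grab_vvid(void)
{
	int vvid = next_vvid;
	next_vvid = (next_vvid + 1) % MY_VVID_POOL;
	return vvid;
}
```

No dialog with the synth, no LRU bookkeeping; the sender just never 
reuses a VVID while it still wants to control the note it was 
assigned to.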


On Thursday 09 January 2003 00.06, Tim Hockin wrote:
> > Yes, and that's my problem with it. Or rather; it's ok for synths
> > to be able to hint that they use controls this way, but designing
> > the voice addressing/allocation scheme around it has serious
> > implications.
>
> Yes, definitely hint.  Anal-retentive hosts will show it as an
> init-option only if it is flagged as such.  Synths still need to
> handle spurious and incorrect events (fall through I'd guess), but
> it is a nice hint to the UI.

They can handle it by actually doing what the hint suggests: sample 
the "initializer" control values only when a note is started. That 
way, they'll "sort of" do the right thing even when driven by data 
generated/recorded for synths/sounds that use these controls as 
continuous. And more interestingly, continuous control synths/sounds 
can be driven properly by initializer oriented data.
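The "sample at note start" behavior is trivial to implement on the 
synth side; something like this sketch (hypothetical names, not 
actual XAP API):

```c
#include <assert.h>

/* Continuously updated by incoming control events. */
typedef struct {
	float velocity_in;
} channel_state;

/* Per-voice state for a synth that treats velocity as an
 * "initializer": the value is sampled once, at note start. */
typedef struct {
	float latched_velocity;
} voice_state;

static void voice_start(voice_state *v, const channel_state *c)
{
	/* Latch the current value; later changes to velocity_in
	 * are simply ignored for this voice. */
	v->latched_velocity = c->velocity_in;
}
```

Driven by continuous data, such a synth just picks up whatever the 
control happens to be at note start, which is the "sort of right" 
behavior described above.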


On Thursday 09 January 2003 00.49, Tim Hockin wrote:
> > Seriously, though; there has to be *one* of DETACH_VVID and
> > VOICE_ALLOCATE. From the implementation POV, it seems to me that
> > they are essentially equivalent, as both are effectively used
> > when you want to say "I'll use this VVID for a new context from
> > now on." VOICE_ALLOCATE is obviously closer to the right name for
> > that action.
>
> Agreed - they are semantically the same.  The question is whether
> or not it has a counterpart to say that init-time controls are
> done.

This is where the confusion/disagreement is, I think: I don't think 
of this event as "INIT_START", but rather as "CONTEXT_START". I don't 
see the need for a specific "init" part of the lifetime of a context. 
Initialization ends whenever the synth decides to start playing 
instead of just tracking controls.

When you want to play a note, you create a context (by grabbing a 
VVID and saying "VOICE_ALLOC" or whatever), and then you just send 
control events. Some control changes will get the synth to play 
sound, and some will instantly or eventually stop the sound. It's 
entirely possible that the synth will sometimes just track controls, 
using a fake voice, or that it will just disable most of the voice 
code, when no sound is to be generated. (It's really two ways of 
doing the same thing, only the former seems more "honest" in terms of 
actual polyphony vs CPU load, since "fake voices" wouldn't count 
towards polyphony.)

The context never *explicitly* ends, but you could say that it ends 
physically when the sender has reassigned the VVID and the synth has 
killed the voice. Thus, no need for a "VOICE_END" or similar event 
either.
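So the whole lifetime of a context, from the sender's POV, is just an 
allocation event followed by plain control events; no VOICE_END 
anywhere. A sketch of that event flow, with made-up event and control 
names (nothing here is settled API):

```c
#include <assert.h>

enum { EV_VOICE_ALLOC, EV_CONTROL };

typedef struct {
	int type;
	int vvid;
	int ctrl;
	float value;
} xap_event;

#define CTRL_VELOCITY 0

/* Emit the full event sequence for one short note into 'out';
 * returns the number of events written. */
static int send_note(xap_event *out, int vvid, float vel)
{
	int n = 0;
	/* Create the context... */
	out[n++] = (xap_event){ EV_VOICE_ALLOC, vvid, 0, 0.0f };
	/* ...this control change starts the sound... */
	out[n++] = (xap_event){ EV_CONTROL, vvid, CTRL_VELOCITY, vel };
	/* ...and this one stops it. No VOICE_END event exists;
	 * the context just fades away when the VVID is reused. */
	out[n++] = (xap_event){ EV_CONTROL, vvid, CTRL_VELOCITY, 0.0f };
	return n;
}
```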


> As for redundancy - I see it as minimum requirement. 
> Suppose I want to turn a voice on with no control changes from the
> default (no velocity, nothing).

Well, that's equivalent to a MIDI NoteOn with a velocity of 0. 
(NoteOff shorthand for running status, that is.) Why would you want 
to do that?

For a continuous velocity instrument, this is obvious; the default 
velocity has to be 0, or you'll have a "random" note started as soon 
as a voice is allocated. Thus you *have* to change the velocity 
control to start a note.

For a "latched velocity" instrument (MIDI style), it's exactly the 
same thing. There will be a NOTE control, and it has to be "off" by 
default. Just set it to "on" to start a note with default velocity.


BTW, it might be a good idea to always provide NOTE output from 
senders, even though continuous velocity and similar synths won't 
care about it. Then again, continuous velocity data doesn't work very 
well as velocity data for note based synths, so it's probably not 
much use...

Something will have to do some transformations to make this work - 
and that's way beyond the API level. Analyze entire "notes" of 
continuous velocity data and calculate a note-on velocity from that; 
that sort of stuff. Or just connect VELOCITY to the VOLUME input, and 
use the synth's default velocity for note-ons? Maybe...


> I need to send SOMETHING to say
> "connect this VVID to a voice".  The minimum required is a VOICE_ON
> or similar.

Yes. "VOICE_ALLOC" doesn't trigger a MIDIism allergy reaction for me 
at least, but it's still a bit confusing...


>  For something that has no per-voice controls (e.g. a
> white-noise machine) you still need to send some event.

Fiddle with the NOTE control, maybe? (Which would default to "off", 
of course.)


> And I'd
> rather see the voice-on protocol be consistent for all instruments.

Me too, but controlling note based and continuous control synths the 
same way is tricky...

Though, you *can* make the NOTE control continuous, rather than a 
switch... Traditional synths would just check it for >0, while 
continuous velocity synths would actually interpret it as the note 
velocity input.

So why not call it VELOCITY, then, and ditch NOTE?

Well, the main reason is release velocity for note based synths. If 
you have only a VELOCITY control, you can interpret the change from 0 
to >0 as "note on", and the actual value as the note velocity. Then 
you would just wait until VELOCITY becomes 0 again, which means "note 
off".

Now, how do we express release velocity with that scheme? Negative 
values? (Could make a continuous velocity synth blow up!) The current 
VELOCITY value before VELOCITY goes to 0? (That requires two VELOCITY 
events to implement note off with velocity, and would make no sense 
to a continuous velocity synth.)


Use both NOTE and VELOCITY controls (both continuous), and have NOTE 
be the primary one? Continuous velocity synths would normally not 
have a VELOCITY input, but would look at the value of NOTE. Other 
synths would look at both, but interpret NOTE as a switch, where >0 
means "on".

Still broken, though: Release velocity doesn't fit in... :-/

Use a third control for that!? Actually, that makes quite some sense, 
since many real instruments use totally different mechanisms for 
starting and stopping notes. Pianos. Cymbals...

VELOCITY vs DAMPING, or something. (Makes sense for continuous 
velocity instruments as well!)
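To illustrate how the NOTE-as-switch reading would work on the 
note-based synth side (control names as discussed above, but the 
structs and functions are purely illustrative):

```c
#include <assert.h>

/* Per-voice control inputs, as a note-based synth would see them. */
typedef struct {
	float note;	/* >0 means "on"; magnitude = attack velocity */
	float damping;	/* separate release mechanism, e.g. damper speed */
} voice_controls;

/* A traditional (note-based) synth just treats NOTE as a switch;
 * a continuous velocity synth would instead read the value of NOTE
 * directly as its velocity input. */
static int note_is_on(const voice_controls *vc)
{
	return vc->note > 0.0f;
}
```

Release "velocity" then lives entirely in DAMPING, which sidesteps 
the negative-value and two-event hacks.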


> If that means you have to send two events (VOICE_ALLOC, VOICE_ON)
> for the white-noise maker, then I can live with that.

Well, yes. And you'll also have to send a third event (corresponding 
to VOICE_OFF) to stop it, since just "letting go" of a VVID doesn't 
affect anything. (It's a NOP, as far as the API and synths are 
concerned.)

If nothing else, *that* should indicate where this VOICE_ON really 
belongs - and I don't think it's anywhere near VVID/voice management. 
It's a Voice Control thing.


On Thursday 09 January 2003 00.55, Tim Hockin wrote:
[...]
> Synth needs to know where the table is - We'd need to do
> double-indirection so that the table can move if needed.
>
> struct voice *v = myvoices[(*(host->vvid_table))[vvid]];

Why double indirection? I think it's enough to say that you must 
never store the pointer across process() calls. Note that senders 
don't use the table, so it doesn't matter if they cause reallocation 
by asking for more VVIDs while processing. Event processors might 
have to worry about it, though.
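That is, single indirection is fine as long as the synth re-fetches 
the table pointer through the host every time, instead of caching it 
across process() calls. A sketch of that rule (hypothetical names, 
modeled on the snippet quoted above):

```c
#include <assert.h>

typedef struct {
	int *vvid_table;	/* host may reallocate this between blocks */
} xap_host_info;

typedef struct {
	int id;
} voice;

static voice myvoices[64];

/* Called from inside process(): go through the host every time.
 * Never store host->vvid_table (or a pointer derived from it)
 * across process() calls. */
static voice *voice_for_vvid(const xap_host_info *host, int vvid)
{
	return &myvoices[host->vvid_table[vvid]];
}
```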

On a sidenote, event processors may also want to be able to find out 
whether or not input VVIDs are compatible with the target of the 
output. Doesn't make much sense to manage your own VVIDs if you never 
need to allocate your own voices... (Transpose and delay plugins, for 
example.)
 

On Thursday 09 January 2003 00.59, Tim Hockin wrote:
[...]
> It's attractive to use special values for meanings.  The plugin
> could tell the host that a VVID is done by setting it to a value. 
> Same for errors.
> How is this different from a plugin sending event back to the host?

(See above. VVID entries are for synths only.)


>  And what of remote plugins?

Allocate VVIDs from the remote host.

Obviously, you are not allowed to assume that a VVID (the index) is 
globally unique. VVIDs for remote synths may well overlap with VVIDs 
for local synths, or remote synths on other hosts.


> Do they have shared-memory access to
> the VVID table or is it function-call based?

They have no access at all, except indirectly through the events sent 
to the synth. (And that's write-only.)


//David Olofson - Programmer, Composer, Open Source Advocate

.- The Return of Audiality! --------------------------------.
| Free/Open Source Audio Engine for use in Games or Studio. |
| RT and off-line synth. Scripting. Sample accurate timing. |
`---------------------------> http://olofson.net/audiality -'
   --- http://olofson.net --- http://www.reologica.se ---


