[linux-audio-dev] XAP and these <MEEP> timestamps...

David Olofson david at olofson.net
Sat Dec 14 15:06:01 UTC 2002


On Saturday 14 December 2002 16.20, Frank van de Pol wrote:
> Good thinking David,
>
> with a small set of events (tempo updates and position updates) you
> can maintain a real-time view of the tempo map for all plugins.
> This is the same concept as used in the ALSA sequencer API.
>
> Since the plugins require prequeueing (because of the processing in
> blocks, and to compensate for lag caused by the plugin or
> processing/transmission delays), and this prequeueing includes the
> tempo/position changes, you'll get into an interesting situation
> where events scheduled for a certain tick and events scheduled for
> a certain time (=sample) might be influenced by the tempo/position
> change events. In the ALSA sequencer this is solved by allowing
> events to be scheduled at either a specific time or a specific tick
> (using two queues).

But you queue ahead, and there are no "slices" or similar, are there? 
Since there are no slices, and thus no unit for a fixed latency, you 
*must* queue ahead - or you might as well use the rawmidi interface 
and SCHED_FIFO.
 
Audio plugins work with one "time slice" at a time. What has happened 
with the timeline during a time slice (one block) is already decided 
before you call the first plugin in the graph. Nothing must be 
allowed to change during the processing of one slice. It is in fact 
too late as soon as the host decides to start processing the block.

Same thing as passing a block to the audio interface - when it's 
away, it's too late to change anything. The latency is something we 
have to deal with, and there is no way whatsoever to get around it. 

(All we can do is reduce it by hacking the kernel - and that seems to 
be pretty damn sufficient! :-)
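
(To illustrate - a minimal sketch in C, with made-up names; nothing 
here is an actual XAP declaration. The point is just that the host 
fixes the timeline state for the block *before* any plugin runs:)

	/* All names hypothetical - illustration only. */
	typedef struct {
		double tempo;	/* current tempo */
		double tick;	/* musical position at frame 0 of the block */
		int rolling;	/* transport running? */
	} timeline_state;

	typedef struct plugin {
		void (*process)(struct plugin *self,
		                const timeline_state *tl, unsigned frames);
	} plugin;

	/* The host decides the timeline state for the whole block first,
	 * then runs the graph; nothing changes until the next block. */
	void run_block(plugin **graph, unsigned n,
	               const timeline_state *tl, unsigned frames)
	{
		for (unsigned i = 0; i < n; ++i)
			graph[i]->process(graph[i], tl, frames);
	}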


You *could* have someone else prequeue future events and send them 
back to you at the right time, but then you would need to say:

	* When you want the events back
	* What the timestamps are related to (real or timeline)

And if the timestamps are related to the timeline:

	* What to do in case of a transport stop or skip


Indeed, this is doable by all means, but it's a *sequencer* - not a 
basic timestamped event system. Throw a public domain implementation 
into the plugin SDK, so we don't have to explain over and over how to 
do this correctly.

Yes, it should be in the public domain, so people can rip the code 
and adapt it to their needs. There are many answers to that third 
question...

OTOH, if people want to keep their source closed, you could say they 
*deserve* to reimplement this all over! ;-)
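
(For reference, a rough sketch of what such a prequeued entry might 
need to carry - the "when", the time base, and the stop/skip policy 
mentioned above. Hypothetical names, not from any proposed header:)

	/* Sketch only; names and fields are not from any XAP proposal. */
	typedef enum { TIME_REAL, TIME_MUSICAL } time_base;
	typedef enum { ON_SKIP_DROP, ON_SKIP_RESCHEDULE } skip_policy;

	typedef struct {
		time_base base;		/* real time, or timeline ticks? */
		skip_policy on_skip;	/* policy on transport stop/skip */
		double when;		/* seconds or ticks, depending on base */
		/* ...plus the actual event payload... */
	} prequeued_event;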


> A nice question to ask is 'what is time'. I suppose that there is a
> direct correlation between time and sample frequency; but what to
> do with a non-constant sample frequency? (This is not a hypothetical
> situation, since a sampled system which is synchronised to an
> external source could be facing a variable sample rate - when slaved
> to a VCR, for instance).

Exactly. This is what Layla does, for example.

I have thought about whether or not plugins should know the exact 
sample rate, but I'm afraid it's too hard to find out the real truth 
to be worth the effort. You can probably ask Layla what the *current* 
exact sample rate is, but what's the use? Plugins will be at least a
few ms ahead anyway, so there's nothing much you can do about the 
data that is already enqueued for playback.

In short, if there is a sudden change, you're screwed, period.


> I believe the answer lies in defining which is your time
> master, and using it as such; so in the case of the slowed-down
> VCR, the notion of time will simply progress more slowly,
> without causing any trouble to the system.

Exactly. I think this is all that should be relevant to any plugins 
that are not also drivers, or otherwise need to directly deal with 
the relation between "engine time" and real world time.


> If no house clock or
> word clock is available, things might end up hairy...

Only if you care about it. For all practical matters, you can assume 
the following:

	input_time = (sample_count - input_latency) / sample_rate;
	output_time = (sample_count + output_latency) / sample_rate;

If the sample rate changes at "sample_count", well, it's already way 
too late to compensate for it, because your changes will appear on 
the output exactly at "output_time".
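
(In C that's simply the following - the formulas above restated, with 
counts and latencies in frames and the results in seconds:)

	double input_time(double sample_count, double input_latency,
	                  double sample_rate)
	{
		return (sample_count - input_latency) / sample_rate;
	}

	double output_time(double sample_count, double output_latency,
	                   double sample_rate)
	{
		return (sample_count + output_latency) / sample_rate;
	}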

Now, you *could* do something about changes in the input sample rate, 
if you like, but I suspect that there isn't much use for that at all. 
If you want to record audio at the maximum quality, I'm afraid you 
just have to turn that wordclock sync off for a while, and let the 
HDR sync the recorded material during playback. (Or you should check 
what's the bl**dy matter with your flaky wordclock sync source! :-)


Need I point out (again?) that input_latency and output_latency are 
*ONLY* to be used when you deal directly with external things (MIDI 
drivers, audio drivers, meters and stuff in GUIs,...), where you need 
to know the "exact" relation between net time and real time?

Other plugins - be they audio or event processors - should never touch 
these. Latency is already compensated for by the driver/wrapper for 
each respective device your data may come from, or eventually arrive 
at.

This is how VST does it, and there are no holes in it in that area, 
AFAICT.


> If for offline processing the audio piece is rendered (inside the
> XAP architecture), this can also be done faster or slower than
> real time, depending on CPU power (I think this is a big bonus).

Yes. This is why "real" time should be completely irrelevant to 
plugins that talk only through Audio and Control Ports.

A MIDI driver *would* have to care, because it delivers events to an 
external device, which has to know exactly when - in real world time 
- the event is to be delivered.

As a normal plugin, you're in pretty much the same situation; you get 
an event which is to be delivered at a specific time - only the 
timestamp is directly or indirectly related to audio time. And 
obviously, in a real time situation, audio is in turn related to real 
time.
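
(A MIDI output driver would essentially do something like this - a 
sketch only, assuming the driver knows at what wall clock time frame 0 
of the current block reaches the outside world:)

	/* Sketch: frame-in-block timestamp -> wall clock delivery time.
	 * block_start_realtime is when frame 0 of this block hits the
	 * outside world, output latency included. */
	double delivery_time(unsigned frame_in_block,
	                     double block_start_realtime,
	                     double sample_rate)
	{
		return block_start_realtime + frame_in_block / sample_rate;
	}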


> mapping should be:
> sample position (using declared, not effective rate) -> time ->
> tick

What's "time" in here? Just map directly from sample position to 
tick. You must ignore real time, since the relation between that, 
audio time and musical time, is managed elsewhere.

During the processing of one block, sample position and musical time 
are in a fixed relation. Sample position is in a relation to real 
time that is controlled by the driver and hardware, and that - for 
all practical matters - means that audio time *is* actual time. (With 
some delay that you should very rarely care about.)


> for the mapping from time to tick a simple, real-time view of the
> relevant part of the tempo map is used;
> 	- time of last tempo change + tempo
> 	- time of last position change + tick  (also used for alignment)
>
> Since for some applications a relative time is most useful, while
> for others an absolute position is better, this is also something
> to look at. Position changes and events queued at an absolute
> position are typically not a good match.

Applies to tempo changes as well, BTW.

Either way, if you don't queue events into the uncertain future, 
there is no problem. If there's a tempo or position change in the 
middle of the current block, well, what's the problem? Just do the 
maths.
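
(The maths really is this small - a sketch with hypothetical names, 
using the minimal tempo map view Frank describes above:)

	/* "Just do the maths": sample frame -> tick, given the last
	 * tempo/position change. All names hypothetical. */
	typedef struct {
		double change_frame;	/* frame of the last change */
		double change_tick;	/* tick position at that frame */
		double ticks_per_frame;	/* current tempo, as ticks/frame */
	} tempo_view;

	double frame_to_tick(const tempo_view *tv, double frame)
	{
		return tv->change_tick
		       + (frame - tv->change_frame) * tv->ticks_per_frame;
	}

	/* A tempo or position change in the middle of a block just means
	 * the host updates the tempo_view at that frame and carries on. */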

Consider this: If you were a real time process with practically zero 
latency, why would you ever need to think about "unexpected" changes? 
The timeline would be *now* by all means, and all you have to do is 
check whether or not *now* is the time to perform a certain action. 
Whether you were just told about the action, or knew about it for a 
minute, is irrelevant, because *now* is the first time ever you can 
be absolutely sure what the right thing to do is.

The same applies to block based processing, only "now" is a time 
slice of non-zero duration, and the output you generate will be 
delayed by a fixed amount of time.


> If events are received from
> multiple sources, or are received out-of-order, they have to be
> merged; special attention is required for those cases.

Yes. This is what the host-managed Shadow Queues are for. We don't 
want plugins to sort events all the time. The host knows when there 
are multiple sources, so let it deal with that *efficiently* when it 
happens.
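
(Merging host-side is just the classic merge of timestamp-ordered 
lists; a sketch, nothing XAP-specific about the names:)

	typedef struct event {
		unsigned frame;		/* timestamp within the block */
		struct event *next;
	} event;

	/* Stable, linear-time merge of two timestamp-ordered queues. */
	event *merge_queues(event *a, event *b)
	{
		event head, *tail = &head;
		while (a && b) {
			if (a->frame <= b->frame) {
				tail->next = a;
				a = a->next;
			} else {
				tail->next = b;
				b = b->next;
			}
			tail = tail->next;
		}
		tail->next = a ? a : b;
		return head.next;
	}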


> Same for
> applications that want to make changes to prequeued events (eg.
> withdraw those).

That's *exactly* why we should not prequeue events in the event 
system, unless we really need it. And we don't, because we can get 
away with just dealing with one block at a time. Prequeueing is 
basically just pushing the "problem" ahead of you.


> I get the feeling that XAP consists of a set of APIs,
> each very focused and as simple as possible. Depending on the use
> cases for the application/plugin, one or more of these XAP APIs can
> be used.

Yes, that's the idea, to the extent it's applicable.


> a quick thought brings a few to my mind; a further
> analysis would be required to complete the picture:
>
> 1 a XAP api for transport of the blocks of audio data

Audio Ports.


> 2 a XAP api for transport of the event data

Event Queues.


> 3 a XAP api for handling the time/tempo stuff (layered upon 1 & 2)

POSITION_START, POSITION_STOP, POSITION_CHANGE, TEMPO_CHANGE events.

Nothing much to do with 1, though, except for both having "sample 
frames" as the reference time unit.


> 4 a XAP api for handling musical events (notes etc.), layered upon
> (1, 2 & 3)

Well, those would be the controls. Dunno if you could say they are 
strictly musical, but at the time you get them, they *do* have a 
definite, fixed relation to the timeline they originated from.

It depends on 3 *only* when plugins need to do more than just 
apply the Control changes right away. It builds on 2, and is not 
explicitly related to 1. (See above.)


> 5 a XAP api for the configuration/settings

This is layered upon 4, but depends only on 4 and 2. It is 
implemented entirely on the host side, and Plugins only have to 
implement 4 (and thus 2) to get 5.


> 6 a XAP api for the topology

For building graphs? Yes - but that's not part of the API proper; it 
belongs in the host SDK.

(You could say that about 5 as well, but OTOH, I think standard 
preset file formats are pretty nice...! *heh* So, 5 is a part of 
the XAP API, and there should be a more or less complete 
implementation in the host SDK lib.)


And then there's this Bay/Channel/Port/whatever mess... :-) Related 
to 1, 3 and 4, but depends on nothing - it's basically just a 
description of what the Plugin provides.


> etc. etc.
>
> just some thoughts,
> Frank.

Thanks for your comments! :-)


//David Olofson - Programmer, Composer, Open Source Advocate

.- The Return of Audiality! --------------------------------.
| Free/Open Source Audio Engine for use in Games or Studio. |
| RT and off-line synth. Scripting. Sample accurate timing. |
`---------------------------> http://olofson.net/audiality -'
.- M A I A -------------------------------------------------.
|    The Multimedia Application Integration Architecture    |
`----------------------------> http://www.linuxdj.com/maia -'
   --- http://olofson.net --- http://www.reologica.se ---


