[LAU] hardware: recording voice and acc. guitar

Pieter Palmers pieterp at joow.be
Mon Jun 2 11:46:39 EDT 2008


Mark Knecht wrote:
> On Thu, May 29, 2008 at 4:32 AM, Pieter Palmers <pieterp at joow.be> wrote:
>> Mark Knecht wrote:
>>> On Wed, May 28, 2008 at 9:31 AM, Pieter Palmers <pieterp at joow.be> wrote:
>>>> Mark Knecht wrote:
>>> <SNIP>
>>>>> P.S. - I didn't understand your comment earlier about a globally
>>>>> available 1394b clock. I worked on that spec and I just don't remember
>>>>> that. Been too long I suppose...
>>>> I'm talking about the Cycle Timer Register that is globally available to
>>>> all
>>>> nodes. It's incremented by the cycle master at 24.576MHz, and all nodes
>>>> have
>>>> (approximately) the same view of this clock. All audio samples
>>>> transported
>>>> on the 1394 bus are timestamped relative to this cycle timer. This means
>>>> that every sample has an "absolute" time attached, no matter what device
>>>> it
>>>> is sent to. This enables the use of multiple devices, like you suggest,
>>>> without any form of external sync. It even beats wordclock sync since
>>>> that
>>>> only ensures relative sync (same rate), but still leaves an ambiguity in
>>>> the
>>>> exact absolute time a sample should have.
>>>>
>>>> I hope that refreshes things a bit :).
>>>>
>>>> Greets,
>>>>
>>>> Pieter
>>>>
>>> Ah, the refresh that helps my pausing brain. Thanks Pieter!
>>>
>>> Now, as a hardware design engineer, it's clear that the host receiving
>>> all this data from the bus can have an accurate picture of when each
>>> sample was sent over the bus. However if each source (in my example of
>>> 3 8-I/O units) is using its own sample clock there is still the
>>> inaccuracy of each unit's 44.1KHz clock being slightly different. (One
>>> at 44,099, one at 44,100, and the 3rd at 44,101) It's that difference
>>> that causes me to always attempt to use a single hardware clock
>>> source.
>>>
>>> Do you know whether Presonus devices do anything to address this
>>> specifically?
>>>
>>> I currently use sync-to-ADAT, sync-to-spdif and Word Clock here. I've
>>> not done any syncing over my Firewire interfaces.
>>>
>> You have the exact timestamp of each sample, and you have the clock this
>> timestamp is referred to. Hence you can use these timestamps to drive a PLL
>> and generate the clock signal. Knowing the absolute time of a sample implies
>> rate sync (i.e. clock sync). Rate sync on its own however does not give any
>> information on absolute timing.
>>
>> The simple way to think about it is to obtain the clock for the audio
>> streams from the pulses generated as follows:
>> 1) take the next sample
>> 2) take the timestamp from the sample
>> 3) wait for increase of CTR register (global firewire clock)
>>  3.1) if timestamp = CTR generate pulse and goto 1
>>  3.2) if timestamp != CTR goto 3
>>
>> This generates a pulse stream at exactly the rate of the audio data. Use
>> this to drive a PLL and you have rate sync. There is only one clock source
>> in the system, i.e. the actor that generates the timestamps on the data.
>>
>> Greets,
>>
>> Pieter
>>
> 
> Right, but I don't think that's enough. Let me exaggerate a bit and
> excuse me being so simplistic. It's for example only and has nothing
> to do with my respect for you and your work. This is as much for
> other readers.
No offense taken.

> 
> Let's assume we have two different Presonus devices. One of the
> devices has a clock that's a little fast - let's say 50KHz instead of
> 44.1KHz while the second device has a clock that's a little slow,
> let's say 40KHz. Both of these audio sample clocks are free running
> and have nothing to do with each other. So what happens? After 5
> seconds have passed the first device has generated 250K samples while
> the second device has generated only 200K samples. If I put these two
> sample streams side by side in Ardour the second stream will need to
> use an extra 25% of its samples to fill the same time period. When
> the two streams are then played the pitch of the second stream will be
> raised. (and/or the pitch of the first stream will be lowered.)
> 
> I will grant you that if someone wanted to resample both streams in
> real time using the time stamps provided in the 1394 stream then both
> could be brought closer together. Does the FreeBob stack do that?
> 
> Now, obviously the 50K/40K difference isn't reality. More realistic is
> +/-100 (expensive) or +/-200 PPM (fairly inexpensive) on the audio
> crystal. Assuming that I ran the calculator correctly (it's 6:30AM
> here) then 1 hour of recording at 44.1K generates 158,760,000 samples.
> At 200PPM my fast interface will generate 31752 samples more than the
> base rate while my slow device will generate 31752 fewer samples.
> Therefore after 1 hour the two streams are separated by 63504 samples.
> In time this is about 1.44 seconds which is certainly audible. Even on
> a 5 minute pop song the difference is about 120 mS which is well into
> the echo range. (Drums and guitar are in sync when the song begins but
> the drums are late at the end.)
> 
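For what it's worth, the arithmetic above checks out. A quick standalone
sketch (plain C, using the figures exactly as you state them, nothing to
do with any driver code):

/* check the +/-200 ppm drift figures quoted above */
#include <stdio.h>

int main(void)
{
    const double fs  = 44100.0;   /* nominal sample rate */
    const double ppm = 200e-6;    /* +/-200 ppm crystal tolerance */

    double per_hour = fs * 3600.0;            /* 158,760,000 samples */
    double drift    = per_hour * ppm;         /* ~31,752 samples per device */
    double spread   = 2.0 * drift;            /* fast device vs. slow device */
    printf("1 hour: spread = %.0f samples = %.2f s\n", spread, spread / fs);

    double song = fs * 300.0 * 2.0 * ppm;     /* 5 minute song */
    printf("5 mins: spread = %.0f samples = %.0f ms\n", song, 1000.0 * song / fs);
    return 0;
}
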
> To the best of my knowledge the only way around this is synchronizing
> the audio sample clocks. I do this with an external signal to all of
> my audio sampling devices. It is possible, and why I asked earlier,
> that devices could create their audio sample clock from the time stamp
> being sent by the 1394 bus master but there is, to the best of my
> knowledge again, no requirement that they do so and I doubt many do.

The timestamp is not sent by the 1394 bus master. It is embedded in the
stream, whoever may have sent it. The bus master only defines the value
of the "1394 wall clock" to which these timestamps are referred.

There IS a requirement for devices to respect the embedded timestamps if
they want to be "professional" and spec compliant. And even so, whether or
not there is a requirement to use them is beside the point: using them is
possible.

I completely understand what synchronization means, but it is not really
relevant. The problem with your reasoning is that the clocks of both
devices are NOT free-running. In devices like the FirePod that support
aggregation, the clocks are generated by a PLL that is locked to the
timestamps of the audio samples received. Hence the clocks are all
locked to one master clock, hence also to each other. The audio sample
clocks ARE synchronized.
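
To make that a bit more concrete: below is a rough sketch of what such a
timestamp-driven clock recovery loop can look like in software. It is
only an illustration of the principle (a second-order delay-locked
loop), not the actual FreeBoB/FFADO code; all names and gains are made
up for the example.

/* Illustrative sketch: lock a local rate/phase estimate to the
 * timestamps embedded in the received stream. CTR wraparound and
 * loop-gain tuning are ignored here for clarity. */
#include <stdint.h>

#define TICKS_PER_SECOND 24576000u   /* 1394 cycle timer rate: 24.576 MHz */

struct sw_dll {
    double phase;    /* predicted CTR value of the next timestamp */
    double period;   /* estimated ticks between timestamps */
    double kp, ki;   /* proportional / integral loop gains */
};

/* called once for every received timestamp */
static void dll_update(struct sw_dll *d, uint32_t timestamp_ticks)
{
    /* error between where the sender says the sample belongs on the
     * shared 1394 clock and where we predicted it */
    double error = (double)timestamp_ticks - d->phase;

    /* second-order loop: correct both the estimated period (rate) and
     * the predicted phase, so every receiver converges on the same
     * master rate AND the same absolute timing */
    d->period += d->ki * error;
    d->phase  += d->period + d->kp * error;
}

The recovered period can then drive the PLL that generates the actual
audio sample clock, along the lines of the pseudo-code quoted earlier.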

I can only suggest rereading what I've written before. The firewire
timestamps (can) provide both rate and phase synchronization within one
bus. There is no need to do any external sync connections provided that
the devices sync to the firewire timestamps. Not all of them do/can sync
to these timestamps, but those are either not AMDTP compliant, or not
"professional" according to the AMDTP spec.

Also note that the context of my statements was the technical merits of
the 1394 bus as a system, not those of specific implementations. So let
me rephrase: "It is a fact that there is a global clock available
throughout the 1394 bus, and that it enables the embedding of all sync
info into the streams themselves, possibly eliminating the need for
external sync when aggregating devices."

Note that this capability is present in neither the USB nor the PCI bus.
On those busses it is therefore impossible to guarantee phase sync between
channels of different devices. What I mean by this is:

* suppose you have a frame:
{left_1[i], right_1[i], left_2[i], right_2[i]}
that contains 4 samples that are supposed to be simultaneous (since they
have the same sample index).

* suppose you send left_1 and right_1 to device1, and left_2 and right_2
to device2. device1 and device2 are identical and have synchronized
clocks (e.g. wordclock).

* suppose device2 was started X frames after device1. (e.g. ALSA started
both devices, but right after starting device1 the process was
interrupted for a time that corresponds to X samples).

* the "real-life" time where the samples are output (available on the
output jack) is:
left_1, right_1 = T0 + i*Ts
left_2, right_2 = T0 + (i+X)*Ts
where T0 is some unspecified point in time,
and Ts is the sample period (1/Fs)
IOW: the channels from device2 lag X frames.

* since X depends on some random event occurring, there is no way to
predict what this offset is going to be.

* now suppose you record these 4 channels back into the PC, through e.g.
device1. It will sample the signals at some point in "real-life" time,
resulting in a frame:
{in_1[j], in_2[j], in_3[j], in_4[j]}
where:
in_1[j]=left_1[i]
in_2[j]=right_1[i]
in_3[j]=left_2[i-X]
in_4[j]=right_2[i-X]
which can also be expressed as:
in_1[j]=left_1[i]
in_2[j]=right_1[i]
in_3[j+X]=left_2[i]
in_4[j+X]=right_2[i]

So what used to be a frame with the same index i is now spread over
multiple frame indexes. Moreover, the exact spreading depends both on
some random value X and on the external routing used to re-record (see
the small numeric sketch after this list).

* The availability of a global clock enables the specification of the
"presentation time", i.e. the "real-life time" at which a sample is
output. This eliminates these issues.

* the fact that all samples (can) have a timestamp also means that you
can sync clocks to these streams, since rate is the derivative of phase.
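
To illustrate the bookkeeping above with numbers, here is a tiny sketch
(purely hypothetical, the value of X is arbitrary):

/* illustrate the startup-offset problem: device2 started X frames late */
#include <stdio.h>

#define N 8   /* frames to show */
#define X 3   /* arbitrary startup offset of device2, in frames */

int main(void)
{
    /* device1 outputs sample index i at T0 + i*Ts,
     * device2 outputs sample index i at T0 + (i + X)*Ts */
    for (int j = X; j < N; j++)
        printf("recorded frame %d: left_1[%d] right_1[%d] left_2[%d] right_2[%d]\n",
               j, j, j, j - X, j - X);

    /* channels 3 and 4 lag by X frames; since X depends on a random
     * scheduling event it cannot be predicted or compensated for,
     * which is exactly what a presentation time fixes. */
    return 0;
}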

AFAIK the 1394 bus is the only one that provides both phase and rate
sync. It is not possible on USB, PCI, MADI, ethernet, ... without the
use of external systems that distribute some form of absolute time. Word
clock is not enough; it only provides rate sync. You need something
like SMPTE LTC.

So the summary:
1) 1394 is technically superior
2) there is no need for external sync, provided devices use the services
provided by the 1394 bus.

Greets,

Pieter



