Dear all,
I'm a maintainer of the ALSA firewire stack. Linux kernel v5.14 was released
a few days ago[1], including some changes to the ALSA firewire stack. The
changes improve the usability of the included drivers by solving some issues.
I appreciate the users who cooperated on this work[2].
This message covers two topics about issues solved in the release:
1. get rid of playback noise by recovering media clock
2. allow some applications to run without periodical hardware interrupts
and another topic:
3. device aggregation
Let me describe the first two topics.
1. get rid of playback noise by recovering media clock
Many users have reported playback noise since the initial version of
each driver in the ALSA firewire stack. The cause of the issue is complicated
to explain, but let me roughly summarize it in the point below:
* mismatch between the audio sample count in the playback stream and the one
expected by the hardware
Since the initial stage of the ALSA firewire stack, the included drivers have
transferred exactly as many audio data frames per second as the sampling
frequency configured via the ALSA PCM interface; e.g. 44,100 frames per
second for 44.1 kHz.
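As a rough illustration of transferring at the nominal frequency, the sketch
below spreads the frames of one second over the 8,000 isochronous cycles
(125 usec each) of IEEE 1394. It is not the code in the drivers; the function
name frames_per_cycle is hypothetical.

```python
# A sketch only: how many audio data frames fall into each isochronous
# cycle when exactly the nominal frequency is transferred.
CYCLES_PER_SEC = 8000  # IEEE 1394 isochronous cycles per second


def frames_per_cycle(nominal_rate, cycles):
    """Return the count of frames for each of the given number of cycles."""
    counts = []
    remainder = 0  # accumulated fraction, in units of 1/8000 frame
    for _ in range(cycles):
        remainder += nominal_rate
        counts.append(remainder // CYCLES_PER_SEC)
        remainder %= CYCLES_PER_SEC
    return counts


seq = frames_per_cycle(44100, CYCLES_PER_SEC)
print(seq[:8])   # [5, 6, 5, 6, 5, 6, 5, 6]
print(sum(seq))  # 44100
```

The sequence is fully decided on the driver side from the nominal frequency
alone, without looking at any packet from the hardware.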
But it turned out that this is not suitable for many models. In recent
years, I've measured actual packets from/to various models with the Windows
and OS X drivers[3], and observed the phenomena below. Here, the configured
frequency is called 'nominal', and the measured frequency is called
'effective'.
* the effective frequency is not the same as the nominal frequency; it is
less or greater by several audio data frames (<= 10 frames)
* the effective frequency is not constant over successive seconds for some
models
The phenomena mean that samples for playback sound can not be transferred
correctly at the nominal frequency, nor, for some models, by computing an
equal number of samples for every second.
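The idea of the effective frequency can be sketched as below. This is not
the measurement tool referred to in [3]. It assumes one packet per
isochronous cycle, and the capture data is made up for illustration; the
function names are hypothetical.

```python
CYCLES_PER_SEC = 8000  # IEEE 1394 isochronous cycles per second
NOMINAL = 44100


def effective_frequencies(frames_in_packets):
    """Sum the frame counts per window of 8000 cycles (one second)."""
    last = len(frames_in_packets) - CYCLES_PER_SEC + 1
    return [
        sum(frames_in_packets[i:i + CYCLES_PER_SEC])
        for i in range(0, last, CYCLES_PER_SEC)
    ]


def synthetic_second(frames):
    """Made-up capture: spread the given frame count over 8000 cycles."""
    return [
        (frames * (c + 1)) // CYCLES_PER_SEC - (frames * c) // CYCLES_PER_SEC
        for c in range(CYCLES_PER_SEC)
    ]


# Two seconds of made-up data which deviate from the nominal frequency.
capture = synthetic_second(44103) + synthetic_second(44098)
effective = effective_frequencies(capture)
deviation = [f - NOMINAL for f in effective]
print(effective)  # [44103, 44098]
print(deviation)  # [3, -2]
```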
Additionally, in isochronous communication of IEEE 1394, some models
support a time stamp per isochronous packet[4]. When parsing the sequence of
time stamps and comparing it to the frequency of samples included in the
packets, I observed the phenomena below:
* the phase of samples based on the computed time stamp shifts during long
packet streaming
* before and after configuring the source of the sampling clock to an
external signal input such as S/PDIF, neither the effective frequency of
samples in packets nor the sequence of time stamps in packets differs.
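For reference, a time stamp of this kind can be handled as sketched below.
The sketch assumes my reading of the SYT field in IEC 61883-6: 16 bits, in
which the upper 4 bits are the low bits of the cycle count, the lower 12 bits
are the offset in the cycle at 24.576 MHz, and 0xffff means no information.
The function names and the values are made up for illustration.

```python
TICKS_PER_SEC = 24576000
TICKS_PER_CYCLE = TICKS_PER_SEC // 8000  # 3072 ticks per isochronous cycle
SYT_NO_INFO = 0xFFFF
SYT_RANGE = 16 * TICKS_PER_CYCLE         # the field wraps every 16 cycles


def syt_to_ticks(syt):
    """Convert the 16 bit field to ticks, or None when it has no info."""
    if syt == SYT_NO_INFO:
        return None
    cycle = (syt >> 12) & 0xF
    offset = syt & 0xFFF
    return cycle * TICKS_PER_CYCLE + offset


def syt_delta(prev_syt, cur_syt):
    """Elapsed ticks between two time stamps, taking wrap-around into account."""
    return (syt_to_ticks(cur_syt) - syt_to_ticks(prev_syt)) % SYT_RANGE


# At 48.0 kHz one sample corresponds to 512 ticks, thus two time stamps
# which are 8 samples apart are expected to differ by 4096 ticks.
print(TICKS_PER_SEC // 48000)        # 512
print(syt_delta(0x1000, 0x2400))     # 4096
print(syt_delta(0xF000, 0x0400))     # 4096
```

Comparing such deltas with the count of samples actually found in the
packets is one way to see whether the phase of samples shifts.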
The phenomena give us some insights. At least, the phase of samples can
not be determined on the driver side alone. The driver is required to
recover the timing at which to put audio data frames into packets from the
packets transferred by the hardware. This is called 'media clock
recovery'[5].
In the engineering field, many methods of media clock recovery have been
invented for each type of media. The one which the ALSA firewire stack in
v5.14 uses is the simplest. It replays the sequence found in the packets
transferred by the hardware[6][7][8]. The result looks better. As far as I
have tested, it covers all of the models supported by the ALSA firewire
stack.
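The idea of replaying can be sketched as below. It is a simplified
illustration, not the code in the patches[6][7][8]. The class name, the
delay parameter, and the data are hypothetical.

```python
from collections import deque


class SequenceReplay:
    """Keep what the device sent and reuse it for packets to the device."""

    def __init__(self, delay_packets):
        # Hypothetical parameter: the count of entries kept back before
        # the replay starts.
        self.delay = delay_packets
        self.queue = deque()

    def on_receive(self, frames, timestamp):
        """Record the frame count and time stamp of a packet from the device."""
        self.queue.append((frames, timestamp))

    def next_transmit(self):
        """Return the parameters for the next packet to the device, or None."""
        if len(self.queue) <= self.delay:
            return None
        return self.queue.popleft()


replay = SequenceReplay(delay_packets=2)
sent = []
for frames in [6, 5, 6, 6, 5]:  # made-up sequence from the device
    replay.on_receive(frames, None)
    params = replay.next_transmit()
    if params is not None:
        sent.append(params[0])
print(sent)  # [6, 5, 6]
```

The packets to the device carry the same sequence as the packets from the
device, later by the delay, instead of a sequence computed from the nominal
frequency.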
2. allow applications to run independently of periodical hardware interrupts
The ALSA PCM interface has a hardware parameter which allows the runtime of
a PCM substream to process audio data frames without scheduling periodical
hardware interrupts[9]. PulseAudio and PipeWire use this feature.
All of the drivers[10] in the ALSA firewire stack now support it. The Linux
FireWire subsystem has a function to flush queued packets up to the most
recent isochronous cycle. The function is available in process context
without waiting for scheduled hardware interrupts, and allows the drivers to
support the feature. In most cases, the flush is triggered by calling
ioctl(2) with SNDRV_PCM_IOCTL_HWSYNC. The call is part of the usual routine
of an ALSA PCM application, thus users do not need to take extra care of it.
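For those who are curious about the raw call, a minimal sketch follows. It
assumes the generic ioctl encoding used on e.g. x86 and ARM, in which _IO()
has neither direction bits nor size, and the definition
SNDRV_PCM_IOCTL_HWSYNC = _IO('A', 0x22) in the UAPI header of ALSA. The
helper hwsync() is hypothetical and is not called here, since it needs a
file descriptor of a PCM character device which is already configured.
Applications usually reach the call via alsa-lib instead.

```python
import fcntl


def _IO(type_char, nr):
    """Assumption: generic encoding, (type << 8) | nr, no direction, no size."""
    return (ord(type_char) << 8) | nr


SNDRV_PCM_IOCTL_HWSYNC = _IO('A', 0x22)


def hwsync(pcm_fd):
    """Ask the driver to update the hardware pointer in process context."""
    fcntl.ioctl(pcm_fd, SNDRV_PCM_IOCTL_HWSYNC)


print(hex(SNDRV_PCM_IOCTL_HWSYNC))  # 0x4122
```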
I think these improvements, together with the configurable size of PCM
buffer added in Linux kernel v5.5, bring a better experience to users.
The remaining topic comes from a comparison with what the existing userspace
driver, libffado2[11], does.
3. device aggregation
Some users pointed out that the drivers in the ALSA firewire stack can not
aggregate several audio data streams into one stream, which is what
libffado2 does. Let me describe my opinion about it.
First, let me start with the fundamental attribute of the audio data frame.
In my understanding of the ALSA PCM interface, the audio data frame is the
unit of audio data transmission. An audio data frame includes a specific
number of audio data samples depending on the hardware; e.g. 2 samples for a
usual sound device. The fundamental attribute of the audio data frame is
that it includes a set of audio data sampled at the same time.
The goal of aggregating audio data streams is to construct one audio data
frame from the audio data frames of several devices. It means that one
audio data frame consists of audio data sampled at different times.
As I described for the phenomena about nominal and effective frequency, each
hardware seems to run at its own unique effective frequency, which varies
from time to time[12], at least over the IEEE 1394 bus. Additionally, we
have experienced that the hardware does not rely on a sequence of packets at
the nominal frequency for sample synchronization. It seems reasonable to
conclude that we can not pick up an audio data frame which consists of audio
data sampled at the same time, even if the data are transferred in the same
isochronous cycle[13].
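A piece of arithmetic shows how quickly two devices part from each other.
The effective frequencies below are made up, within the range of deviation
described in the first topic, and the function names are hypothetical.

```python
def drift_frames(effective_a, effective_b, seconds):
    """Frames by which device A gets ahead of device B after the duration."""
    return (effective_a - effective_b) * seconds


def frames_to_msec(frames, nominal_rate):
    """Express a count of frames as time at the nominal frequency."""
    return 1000.0 * frames / nominal_rate


# Made-up example: both devices are configured to 44.1 kHz.
drift = drift_frames(44103, 44098, seconds=60)
print(drift)                                   # 300
print(round(frames_to_msec(drift, 44100), 2))  # 6.8
```

Even a difference of a few frames per second accumulates to several
milliseconds within a minute.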
To achieve the aggregation, we would need either to loosen the fundamental
attribute of the audio data frame by accepting a range of sampling times for
the audio data in the frame, or to implement one of the resampling methods
to adjust the phase of audio data to the frame.
Although the aggregation itself is superficially useful, it does not seem to
be a requirement for a device driver in the hardware abstraction layer of a
general purpose operating system. It may be over-engineering.
[1] Linux 5.14
https://lore.kernel.org/lkml/CAHk-=wh75ELUu99yPkPNt+R166CK=-M4eoV+F62tW3TVg…
[2] The cooperation was done in my public repository on
github.com:
https://github.com/takaswie/snd-firewire-improve
[3] The method is described in the message:
[alsa-devel] IEEE 1394 isoc library, libhinoko v0.1.0 release
https://lore.kernel.org/alsa-devel/20190415153053.GA32090@workstation/
[4] The resolution of the time stamp is 24.576 MHz, which is the same as
the specification of cycle time in IEEE 1394. The method to compute the time
stamp of a packet for samples in the packet is defined by IEC 61883-6.
We can see an integrated document for it published by the industry group:
https://web.archive.org/web/20210216003054/http://1394ta.org/wp-content/upl…
[5] I borrow the expression from IEEE 1722. IEC 61883-6 uses a specific term,
sampling transmission frequency (STF), to express an idea similar to
the media clock.
[6] [PATCH 0/3] ALSA: firewire: media clock recovery for syt-aware devices
https://lore.kernel.org/alsa-devel/20210601081753.9191-1-o-takashi@sakamocc…
[7] [PATCH 0/6] ALSA: firewire: media clock recovery for syt-unaware devices
https://lore.kernel.org/alsa-devel/20210531025103.17880-1-o-takashi@sakamoc…
[8] [PATCH 0/3] ALSA: firewire-motu: media clock recovery for sph-aware devices
https://lore.kernel.org/alsa-devel/20210602013406.26442-1-o-takashi@sakamoc…
[9] SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP. When the PCM substream has a
flag of SNDRV_PCM_INFO_NO_PERIOD_WAKEUP, it's available.
[10] Precisely, all except for snd-isight.
[11]
http://www.ffado.org/
[12] Precisely, the hardware appears to run its own unique media clock over
the IEEE 1394 bus.
[13] A precise discussion requires knowledge of IEC 61883-6 and of the
vendor-specific methods of packetization.
Regards
Takashi Sakamoto