[LAD] ALSA documentation

Hannu Savolainen hannu at opensound.com
Mon Nov 17 17:54:23 UTC 2008

Dan Mills wrote:
> On Mon, 2008-11-17 at 14:54 +0200, Hannu Savolainen wrote:
>> Audio (or MIDI) applications produce and consume audio streams. All they 
>> need to do is to tell what kind of stream (rate, format, etc) they want 
>> to play/record. They can also adjust their input/output volume or select 
>> the recording source if necessary. In this way the 'system tools' (or 
>> any application dedicated for these purposes) can be used to route and 
>> mix the
> That is a very simplistic view of audio applications, and while it fits
> simple media players and even simple audio editors, it fails horribly in
> a large and complex environment. Even something like a jack client
> (which is superficially your produce and consume audio streams example),
> often needs significant information about timing, DMA sample position
> and the like to be able to do its thing.  
Getting the DMA sample position fits in the picture. You get it from the 
'audio stream'. You don't need to bypass the 'system' and talk 
directly to the device.
> Clock source is certainly something that some subset of applications
> should be legitimately concerned with, as are things like head amp gain,
> phantom power switching and so on. By no means every app needs to care
> about this stuff, but that is policy and should not be the business of
> an API to define. 
The point is that these kinds of details should be handled exclusively by 
programs dedicated to such purposes.

You may also need to go deep inside the hardware clock/timing if you are 
doing very special things, like correlating measured brain/EEG signals 
with an acoustic stimulus played through the sound card.

>> However if the application also tries to interact with the hardware (by 
>> opening /dev/mixer in OSS or by doing similar things with ALSA) then 
>> shit will happen. This kind of interaction with hardware may mean that 
>> the application refuses to work with a pseudo/loopback device that hands 
>> the signal directly to an Icecast server. 
> That is what jackd or similar (that happen at a higher level than the
> driver) are for, trying to do this at too low a level is always going to
> cause pain and compatibility issues as the low level stuff needs to
> support all the warts the hardware has. 

> Consider also that not everything output from a soundcard is
> automatically PCM audio, both AES3 and spdif have non audio modes and
> both are useful upon occasion. 
These too are attributes of the audio stream. The application simply 
needs to declare that it has an AC3 stream.
>> All this just because the 
>> developer of the application wanted to add nice features to wrong place.
> Or because the developer NEEDED the audio interface in a particular mode
> for anything to work (a digital surround encoder would be a reasonable
> example - it needs to set the 'non audio' flag in the serial data stream
> as otherwise the data makes no sense).
It's the responsibility of the system to ensure that the integrity of the 
stream is preserved (once it has been told the stream is a digital 
bitstream). However, the application should not try to access the actual 
device to turn the audio/data bit on or off, or to reconfigure the 
signal path to unity gain.

>>>> Equally well an audio player application should just open a connection
>>>> to the audio device, set the rate/format and then just start to play.
>>>> They should not try to do things like automatic unmuting the sound card.
> That is a policy issue and is probably untrue for at least some
> applications and use cases: A softphone that 'just works' for the common
> cases is impossible without the ability to set microphone routing and
> gain (Otherwise you end up with a lot of faffing about to get the thing
> to work). 
There is nothing wrong with this. For example, the _audio_ API of OSS has 
ioctl calls for selecting the recording source and level. In addition, 
the softphone may need to open the microphone (audio) device so that the 
virtual mixer is bypassed (also for security reasons).
> Now I may take the view that ideally the softphone should stay out of
> setting that sort of thing, but if I was distributing one, I would like
> to have the thing work out of the box in the common cases (it cuts down
> on support calls). 
Not at all. The application just needs to use the right API (subset) 
instead of trying to peek/poke the hardware in some random way.

OTOH a softphone application that always wants to switch the 
input/output to microphone/headset may be a pain in the ass.
> The thing is, in professional audio (I do theatre sound for a living),
> the sort of thing Fons is doing is just not that uncommon, lots of
> speakers (on delay lines), lots of dynamically changing routing, a feed
> to the OB truck (at a different sample rate), positioning audio with
> dynamically changing delays and levels, hundreds of cues (that can
> dynamically reconfigure routing and desk setup).... Being able to
> control the interfaces at a low level (and on the fly from automation
> scripts) is **IMPORTANT**. 
IMHO it's important that applications developed for a given task stay 
focused on that particular task only. An application that takes care 
of automation can trigger other applications to do the right things at 
the right moment. For example, it can send an SMS to your wife just before 
the final curtain (so that she can turn the coffee machine on just before 
you arrive). It can trigger sound/light effects or run scripts that 
reconfigure this and that between acts. It can send MIDI messages to 
move the sliders on the mixing console. However, it's not a good idea to 
stuff this application with every possible feature such as audio 
players/recorders, EQs, limiters/compressors, audio analyzers and so on. 
If they are separate programs you can use the best available program 
for each purpose instead of being stuck with the clunky secondary tools 
of an otherwise fine automation program.

Best regards,


More information about the Linux-audio-dev mailing list