Dan Mills wrote:
> On Mon, 2008-11-17 at 14:54 +0200, Hannu Savolainen wrote:
>> Audio (or MIDI) applications produce and consume audio streams. All
>> they need to do is to tell what kind of stream (rate, format, etc.)
>> they want to play/record. They can also adjust their input/output
>> volume or select the recording source if necessary. In this way the
>> 'system tools' (or any application dedicated to these purposes) can be
>> used to route and mix the
> That is a very simplistic view of audio applications, and while it fits
> simple media players and even simple audio editors, it fails horribly in
> a large and complex environment. Even something like a jack client
> (which is superficially your produce and consume audio streams example)
> often needs significant information about timing, DMA sample position
> and the like to be able to do its thing.
Getting the DMA sample position fits in the picture: you get it from the
'audio stream'. You don't need to bypass the 'system' and talk directly
to the device.
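To illustrate, here is a minimal sketch against the plain OSS audio API
(the device path, rate and buffer size are just examples, and most error
handling is omitted). Note that everything, including the DMA pointer,
comes from the stream fd itself:

/* Sketch: play through the audio stream and query the DMA/sample
 * position from the very same fd. No /dev/mixer access anywhere. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/soundcard.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/dsp", O_WRONLY);       /* device path is an example */
    if (fd == -1) {
        perror("open");
        return 1;
    }

    int fmt = AFMT_S16_LE, channels = 2, rate = 48000;
    ioctl(fd, SNDCTL_DSP_SETFMT, &fmt);        /* tell the stream format   */
    ioctl(fd, SNDCTL_DSP_CHANNELS, &channels); /* ...the channel count     */
    ioctl(fd, SNDCTL_DSP_SPEED, &rate);        /* ...and the sample rate   */

    static short silence[48000 * 2];           /* one second of silence    */
    write(fd, silence, sizeof silence);        /* just start to play       */

    count_info ci;                             /* DMA position comes from  */
    if (ioctl(fd, SNDCTL_DSP_GETOPTR, &ci) != -1)  /* the stream, not the  */
        printf("bytes played: %d, dma ptr: %d\n",  /* device               */
               ci.bytes, ci.ptr);

    close(fd);
    return 0;
}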
> Clock source is certainly something that some subset of applications
> should be legitimately concerned with, as are things like head amp gain,
> phantom power switching and so on. By no means every app needs to care
> about this stuff, but that is policy and should not be the business of
> an API to define.
The point is that details like these should be handled exclusively by
programs dedicated to that particular purpose. You may also need to go
deep into the hardware clock/timing if you are doing very special things
like correlating measured brain/EEG signals with an acoustic stimulus
played through the sound card.
>> However if the application also tries to interact with the hardware
>> (by opening /dev/mixer in OSS or by doing similar things with ALSA)
>> then shit will happen. This kind of interaction with hardware may mean
>> that the application refuses to work with a pseudo/loopback device
>> that hands the signal directly to an Icecast server.
> That is what jackd or similar (which happen at a higher level than the
> driver) are for; trying to do this at too low a level is always going to
> cause pain and compatibility issues, as the low-level stuff needs to
> support all the warts the hardware has.
>
> Consider also that not everything output from a soundcard is
> automatically PCM audio; both AES3 and S/PDIF have non-audio modes and
> both are useful upon occasion.
These too are attributes of the audio stream. The application simply
needs to tell the system that it is playing an AC3 stream.
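With OSSv4 that boils down to selecting the AC3 sample format on the
ordinary audio device. A sketch, assuming a soundcard.h that defines
AFMT_AC3 (the device path and rate are just examples):

#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/soundcard.h>
#include <unistd.h>

/* Sketch: open an output for a pre-encoded AC3 bitstream. The stream
 * attribute (AFMT_AC3) is all the application declares; it never touches
 * the S/PDIF audio/data bit or the mixer itself. */
int open_ac3_output(const char *devname)       /* e.g. "/dev/dsp" */
{
    int fd = open(devname, O_WRONLY);
    if (fd == -1)
        return -1;

    int fmt = AFMT_AC3;                        /* declare a non-PCM stream */
    if (ioctl(fd, SNDCTL_DSP_SETFMT, &fmt) == -1 || fmt != AFMT_AC3) {
        close(fd);                             /* device can't pass AC3    */
        return -1;
    }

    int rate = 48000;                          /* bitstream frame rate     */
    ioctl(fd, SNDCTL_DSP_SPEED, &rate);

    return fd;                                 /* write() the AC3 frames   */
}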
>> All this just because the developer of the application wanted to add
>> nice features in the wrong place.
> Or because the developer NEEDED the audio interface in a particular mode
> for anything to work (a digital surround encoder would be a reasonable
> example - it needs to set the 'non audio' flag in the serial data
> stream, as otherwise the data makes no sense).
It's the responsibility of the system to ensure that the integrity of the
stream is preserved (once the application has told it that the stream is
a digital bitstream). However, the application should not try to access
the actual device to turn the audio/data bit on or off, or to reconfigure
the signal path to unity gain.
>> Equally well, an audio player application should just open a connection
>> to the audio device, set the rate/format and then just start to play.
>> It should not try to do things like automatically unmuting the sound
>> card.
>>
> That is a policy issue and is probably untrue for at least some
> applications and use cases: A softphone that 'just works' for the common
> cases is impossible without the ability to set microphone routing and
> gain (Otherwise you end up with a lot of faffing about to get the thing
> to work).
There is nothing wrong with this. For example, the _audio_ API of OSS has
ioctl calls for selecting the recording source and level. In addition,
the softphone may need to open the microphone (audio) device so that the
virtual mixer is bypassed (also for security reasons).
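A sketch of what I mean, using the OSSv4 per-stream recording ioctls (the
device path and the chosen source index are only examples; a real
softphone would match the names returned by the enumeration, and the
oss_mixer_enuminfo layout is the OSSv4 one):

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/soundcard.h>

/* Sketch: pick the recording source and level through the audio API of
 * the stream itself, without ever opening /dev/mixer. */
int setup_capture(const char *devname)         /* e.g. "/dev/dsp" */
{
    int fd = open(devname, O_RDONLY);
    if (fd == -1)
        return -1;

    oss_mixer_enuminfo ei;                     /* list available sources   */
    if (ioctl(fd, SNDCTL_DSP_GET_RECSRC_NAMES, &ei) != -1) {
        int i;
        for (i = 0; i < ei.nvalues; i++)
            printf("src %d: %s\n", i, ei.strings + ei.strindex[i]);
    }

    int src = 0;                               /* index of e.g. "mic"      */
    ioctl(fd, SNDCTL_DSP_SET_RECSRC, &src);    /* select recording source  */

    int vol = 80 | (80 << 8);                  /* left | right, 0..100     */
    ioctl(fd, SNDCTL_DSP_SETRECVOL, &vol);     /* set recording level      */

    return fd;                                 /* read() captures the mic  */
}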
> Now I may take the view that ideally the softphone should stay out of
> setting that sort of thing, but if I was distributing one, I would like
> to have the thing work out of the box in the common cases (it cuts down
> on support calls).
Not at all. The application just needs to use the right API (subset)
instead of trying to peek/poke the hardware in some random way. OTOH, a
softphone application that always wants to switch the input/output to the
microphone/headset may be a pain in the ass.
> The thing is, in professional audio (I do theatre sound for a living),
> the sort of thing Fons is doing is just not that uncommon: lots of
> speakers (on delay lines), lots of dynamically changing routing, a feed
> to the OB truck (at a different sample rate), positioning audio with
> dynamically changing delays and levels, hundreds of cues (that can
> dynamically reconfigure routing and desk setup)... Being able to
> control the interfaces at a low level (and on the fly from automation
> scripts) is **IMPORTANT**.
IMHO it's important that applications developed for given tasks are kept
focused on that particular task only. An application that takes care of
automation can trigger other applications to do the right things at the
right moment. For example, it can send an SMS to your wife just before
the final curtain (so that she can turn the coffee machine on just before
you arrive). It can trigger sound/light effects or run scripts that
reconfigure this and that between acts. It can send MIDI messages to move
the sliders on the mixing console. However, it's not a good idea to stuff
this application with all possible functionality such as audio
players/recorders, EQ, limiters/compressors, audio analyzers and so on.
If these are separate programs, you can use the best available program
for each purpose instead of sticking with the secondary tools bundled
with an otherwise fine automation program.
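Moving a fader remotely is, after all, just a three-byte MIDI control
change message. A minimal sketch using a raw OSS MIDI device (the device
path, channel and controller number are only examples and depend on how
the console is set up):

#include <fcntl.h>
#include <unistd.h>

/* Sketch: ask the mixing console to move one fader by sending a MIDI
 * control change. Which controller maps to which fader depends entirely
 * on the console. */
int move_fader(unsigned char channel, unsigned char controller,
               unsigned char value)
{
    int fd = open("/dev/midi", O_WRONLY);      /* raw MIDI port (example)   */
    if (fd == -1)
        return -1;

    unsigned char msg[3];
    msg[0] = 0xB0 | (channel & 0x0F);          /* control change, channel n */
    msg[1] = controller & 0x7F;                /* e.g. 7 = channel volume   */
    msg[2] = value & 0x7F;                     /* fader position 0..127     */

    ssize_t n = write(fd, msg, sizeof msg);
    close(fd);
    return n == (ssize_t)sizeof msg ? 0 : -1;
}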
Best regards,
Hannu