Christoph Eckert wrote:
Hi,
while doing the videos, I wondered about one thing.
Linux is a multitasking and a multiuser machine.
But: there's no audio layer on the operating system level.
While different users can burn CDs and share the CPU, there's
no possibility included that multiple applications can play
sound - if the applications are started by different users,
the situation gets even worse.
Isn't it a really ugly design that one application can grab the
audio device using ALSA or OSS and block it for any other
application?
So, there have been different solutions for this, esound for
Gnome, artsd for KDE. But still, there's no reasonable
default system that can be used by *all* applications, so
application developers can use *one* API instead of having to
implement multiple completely different interfaces (ALSA,
OSS, arts, esound, jack, portaudio...).
So my question is: Is there work in progress somewhere out
there which will solve this by implementing a common unix
audio layer? Or will jack be the layer that will soon be
included into the boot process by all distributors?
Well, I guess JACK is a special soundserver for a special
task, and I guess it lacks some other multimedia features.
So, is there any common acknowledgement concerning unix audio
(you see I avoid saying linux and use unix instead ;-)?
Any thoughts are very welcome.
You are absolutely right, and this is a major issue preventing linux
from gaining popularity among audio professionals.
In a nutshell, and barring all the technical details to get there, there
should be a method by which audio applications can simultaneously, and
easily, write to the speakers. That's really what you are advocating,
right? A user of any audio app wants to hear the output on his speakers
just as a gimp user needs to see the results of a graphics program on
the screen. I don't think it's a complicated issue, but the variations
of low-level audio applications make it out to be. Perhaps the issue
stems from what makes linux so popular -- its openness.
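
To make "simultaneously write to the speakers" a bit more concrete, here
is a minimal sketch of what a shared audio layer would have to do at its
core: add the sample streams of several applications together and clip
the sum to the range the hardware accepts. The function name, the 16-bit
sample range and the use of plain Python lists are assumptions for
illustration only; this is not how ALSA, JACK, esound or arts actually do it.

```python
def mix_streams(streams, lo=-32768, hi=32767):
    """Sum several PCM sample lists into one, clipping to [lo, hi].

    `streams` is a list of lists of integer samples, one list per
    application (hypothetical representation, assumed 16-bit signed).
    Shorter streams are treated as silent once they run out.
    """
    length = max((len(s) for s in streams), default=0)
    mixed = []
    for i in range(length):
        # Add up whatever each application has to say at this instant.
        total = sum(s[i] for s in streams if i < len(s))
        # Clip instead of wrapping around, which would sound far worse.
        mixed.append(max(lo, min(hi, total)))
    return mixed
```

A real sound server also has to deal with differing sample rates, latency
and per-user permissions, which this sketch leaves out entirely.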
There is, of course, a difference between visual data and audio data.
For humans, you can separate simultaneous visual information streams in
sections and the information is not necessarily lost to the human - e.g.
windows of graphics or a window of a video stream. A human can
selectively view graphics output presented in different windows. While a
human might miss a video stream event in one window while watching a
second stream, it won't be confusing to the human -- he'll just miss the
event. But, it is difficult to separate different and simultaneous audio
information so that it will be intelligible to humans. You can send the
information to different speakers, but it won't be decipherable by humans.
Business really drove the need for desktop applications. Graphics
functionality eventually became fundamental to this need. Video games
extended the need and drove the competition of graphics technology. But
for the discussion here, it is important to note that video games also
drove the popularity of affordable audio devices on PCs. The two grew up
a bit differently both technically and business-wise. While video games
drove the edge of graphics technology, everybody benefited - business
users, home users and video game enthusiasts. Graphics then, became big
business and businesses grew (nvidia) and died (3dfx) dramatically.
Audio is not that lucky -- there is not as strong a need in the business
community for audio (heck, if it can play a wave file for your
presentation, what else do you need?). And, I know of no other sector
that drives hw/sw audio development as much as video games.
For the *nix crowd, this is why you have x.org on one hand and oss,
alsa, jack, arts, esound, etc... on the other.
-- Bottom line: potential, and realized, ROI
Ok, y'all know this, I'm sure.
I have several reasons to state this obvious phenomenon and this is my
point if you've read this far: even though the two matured differently
and are perceived differently by humans, I don't believe the basic
technical issues of managing and routing the two kinds of data are all that different.
I assume that there are several ways that a user can set up linux to
drive graphics to a display, but all I know is Xserver. This viewpoint
raises the question of whether x.org is the "vehicle" -- both technically
and politically -- to promote and maintain a common method to write to
the speakers. X.org seems rather big, so it might be the wrong
organization, but then again, its power might make it the right choice.
Perhaps programmers of current low-level audio apps will feel that they
may lose control by approaching this organization. Then again, it might
significantly raise the profile of audio in the linux community by
utilizing the x.org organization as the vehicle to promote and develop a
Unified Audio Driver.
brad