On 06/05/17 13:08, David Kastrup wrote:
>> 1. Different audio interfaces provide completely different routing
>> capabilities, so what you want to get isn't that easy to do, if at
>> all.
> Sorry, but that's just handwaving. An audio interface will either be
> able to route some input to some output with a given volume or not.
> The API to jack (or something doing that kind of job for the
> connection management on its behalf) would just request the
> connection and would not need to know how it was going to be
> established. Many many soundcards already provide amixer controls
> that can be used for this functionality if you know how, and the "if
> you know how" part can be described in a database.
>
> As the database grows, more cards will transparently support
> hardware monitoring.
>
> It most definitely _is_ easy to do since the hard work is in the
> ALSA controls of individual drivers (or even other kinds of command
> line utilities used for accessing mixers) and those are already
> there for a whole lot of cards.
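
Taken at face value, such a "database entry" would boil down to a
handful of ALSA mixer writes - roughly something like this sketch
(the card address and control name are made up for illustration):

    #include <errno.h>
    #include <alsa/asoundlib.h>

    /* Set the volume of one (hypothetical) routing control. */
    static int set_route_volume(const char *card, const char *name,
                                long vol)
    {
        snd_mixer_t *mixer;
        snd_mixer_selem_id_t *sid;
        snd_mixer_elem_t *elem;
        int err;

        if ((err = snd_mixer_open(&mixer, 0)) < 0)
            return err;
        if ((err = snd_mixer_attach(mixer, card)) < 0 ||
            (err = snd_mixer_selem_register(mixer, NULL, NULL)) < 0 ||
            (err = snd_mixer_load(mixer)) < 0) {
            snd_mixer_close(mixer);
            return err;
        }

        snd_mixer_selem_id_alloca(&sid);
        snd_mixer_selem_id_set_index(sid, 0);
        snd_mixer_selem_id_set_name(sid, name);
        elem = snd_mixer_find_selem(mixer, sid);
        if (!elem) {
            snd_mixer_close(mixer);
            return -ENOENT;
        }
        err = snd_mixer_selem_set_playback_volume_all(elem, vol);
        snd_mixer_close(mixer);
        return err;
    }

    /* e.g. set_route_volume("hw:1", "Matrix 01 Mix A", 100); */

The hard part is not issuing the writes - it is knowing, per device,
*which* controls to write, as the following shows.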
This thread has been going on for much too long already.
Here is an actual data point wrt. current hardware:
Take the Focusrite Scarlett class of USB-based audio devices (for
which I happened to write the ALSA mixer driver [based on previous
work by Robin Gareus]).

These devices use the standardized USB Audio Class endpoints for the
audio, but use custom endpoints for accessing the mixer. Therefore
audio I/O did work out of the box with the standard Linux driver, but
mixer control needs a hardware-specific driver, for reasons that might
become clear in the following exposition.
For the Scarlett 18i8 there are 18 inputs and 8 outputs. Internally
there is an 18x8 mixing matrix (the same(!) matrix size is actually
used for variants with fewer actual I/Os ...). Actually, the
above-mentioned I/O numbers refer to the physical connectors. On the
USB side, there are also 18 capture "endpoints" and 8 playback
"endpoints".
Now the thing is, these capture/playback endpoints aren't tied to a
specific electrical I/O. Instead, for each Matrix-Input,
Physical-Output and Capture-Input there is a software-configurable
multiplexer - exposed as an ALSA mixer enum-type control - to choose
from any of the available Matrix-Outputs, Physical-Inputs and
Playback-Outputs (except that you can't directly connect a
Matrix-Output to a Matrix-Input).

That's the - quite flexible - topology of the device, which obviously
can and should allow for hardware monitoring...
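
Configuring that topology from user space means writing those
enum-type controls. A minimal sketch of re-pointing one multiplexer
via the ALSA control API (the control name and item index here are
hypothetical; the real names come from the driver):

    #include <alsa/asoundlib.h>

    /* Select source <item> on the enum control <name>. */
    static int set_mux(const char *card, const char *name,
                       unsigned int item)
    {
        snd_ctl_t *ctl;
        snd_ctl_elem_id_t *id;
        snd_ctl_elem_value_t *val;
        int err;

        if ((err = snd_ctl_open(&ctl, card, 0)) < 0)
            return err;

        snd_ctl_elem_id_alloca(&id);
        snd_ctl_elem_id_set_interface(id, SND_CTL_ELEM_IFACE_MIXER);
        snd_ctl_elem_id_set_name(id, name);

        snd_ctl_elem_value_alloca(&val);
        snd_ctl_elem_value_set_id(val, id);
        snd_ctl_elem_value_set_enumerated(val, 0, item); /* chan 0 */

        err = snd_ctl_elem_write(ctl, val);
        snd_ctl_close(ctl);
        return err;
    }

    /* e.g. set_mux("hw:1", "Matrix 01 Input Playback Route", 3); */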
Let's now assume that JACK provides some kind of hardware monitoring
support by detecting direct connections between capture-inputs and
playback-outputs and, instead of copying audio data on the software
side, chooses to modify the mixer instead (I don't know what --hwmon
really does). Now how could it do that?
First, it needs some kind of mapping that says which physical I/O
corresponds to which mixer control. As you can see, this entirely
depends on the device's topology. So, for a solution to be any good,
it has to support different topologies. At least "Summing" nodes,
"Gain control" nodes, "Multiplexer" nodes, "Switch" nodes (-> Mute)
and "Duplicate" nodes are required to model current hardware. These
nodes can be more or less arbitrarily connected (source -> sink),
like a DAG (directed acyclic graph).
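
In C, such a topology database would have to model something along
these lines (a sketch; the names and layout are illustrative only):

    /* Node kinds needed to describe current mixer hardware. */
    enum node_kind {
        NODE_SUM,       /* mixes all of its sources onto one sink   */
        NODE_GAIN,      /* one source, one sink, variable gain      */
        NODE_MUX,       /* selects exactly one of many sources      */
        NODE_SWITCH,    /* passes or mutes its single source        */
        NODE_DUP,       /* fans one source out to many sinks        */
        NODE_HW_IN,     /* physical input  (a source of the graph)  */
        NODE_HW_OUT,    /* physical output (a sink of the graph)    */
    };

    struct node {
        enum node_kind  kind;
        const char     *alsa_ctl;   /* backing ALSA control, if any */
        struct node   **sources;    /* incoming edges of the DAG    */
        unsigned int    n_sources;
    };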
Secondly, we have to find a path through this graph from a certain
hardware input to a certain hardware output. But JACK actually does
not deal with hardware inputs and hardware outputs; it deals with
capture-inputs and playback-outputs. For the Scarlett, we know that
we have to look at the Capture-Mux to find the hardware input it is
connected to (but it might even be connected to something else). We
also look at the playback-output and have to find the hardware
output. There are actually two ways a hardware output could be
connected to a hardware input: either the Hardware-Output-Mux
directly selects a Hardware-Input, or it selects a Matrix-Output to
which some Matrix-Input is connected, via the Matrix-Input-Mux, from
the given hardware input.

Notice how the device is perfectly capable of sending the monitoring
signal to a hardware output which is not connected to any
capture-input. So the whole idea that JACK works with capture-inputs
instead of physical hardware inputs is detrimental to what we
actually want to achieve.
But it does not stop there: to establish a new monitoring connection,
there are now also *two* possible paths to choose from: either via
the mixer (which might still need some MUX reconfiguration!), or via
the Output-Mux. Except you can't use the Output-Mux method if you
want to route more than one input to the same output (i.e. sum them).
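
That choice is itself logic the would-be "master" has to implement;
roughly like this (illustrative only, not driver code):

    enum route_method { VIA_OUTPUT_MUX, VIA_MATRIX };

    /* Pick a path for a new monitoring connection to an output that
     * already has <sources_on_output> inputs routed to it. */
    static enum route_method pick_route(unsigned int sources_on_output)
    {
        /* The output mux selects exactly one source: it cannot sum. */
        if (sources_on_output == 0)
            return VIA_OUTPUT_MUX;
        /* Summing needs the 18x8 matrix, and may first require
         * re-pointing a free Matrix-Input-Mux at our input. */
        return VIA_MATRIX;
    }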
To add another twist, the Matrix hardware does not allow Mute/Unmute
(unlike the Output-Masters, which can be muted). A hardware-specific
mixer application can implement this by storing the old value and
setting the gain to "0" (which is actually labeled -128dB on the
Scarlett). As JACK was (intentionally, AFAIK) designed not to provide
any gain control by itself, there has to be some "magic" gain value
to use when "unmuting" a certain Matrix connection. While "0 dB"
might seem a reasonable choice, it clearly falls behind the actual
capabilities of the device, and using a fixed value would certainly
be an annoying limitation in practice.
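
The soft-mute trick itself is simple enough - a sketch of what such a
mixer application would keep per matrix cell (set_matrix_gain() here
stands in for the actual ALSA control write):

    struct matrix_cell {
        long gain;        /* current hardware value          */
        long saved_gain;  /* value to restore on unmute      */
        int  muted;
    };

    static void cell_mute(struct matrix_cell *c, long min_gain)
    {
        if (c->muted)
            return;
        c->saved_gain = c->gain;
        c->gain = min_gain;          /* e.g. the -128dB step */
        c->muted = 1;
        /* set_matrix_gain(c, c->gain); -- hypothetical write */
    }

    static void cell_unmute(struct matrix_cell *c)
    {
        if (!c->muted)
            return;
        c->gain = c->saved_gain;     /* the "magic" unmute value */
        c->muted = 0;
        /* set_matrix_gain(c, c->gain); */
    }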
It basically boils down to: "Who controls the mixer?"
E.g. for MUX settings there has to be one single master who decides how
the device is going to be configured and who is free to make any
changes, as required.
Any software that sets itself up as that "master" had better work
with whatever flexibility a given interface has, instead of
arbitrarily limiting it. In other words, it has to, more or less,
support all the features a device has. JACK is not - and will most
certainly never be - able to handle such things. And IMHO it
shouldn't. From a software-engineering standpoint, a dedicated mixer
application as "master" is the simplest solution, and - by far - the
most flexible.
A side note: the ALSA mixer stuff is quite horrible. E.g. the
ordering of the controls is determined by some heuristics in the
user-space ALSA libs (which take the snd_hctl_* (hardware control)
elements from the kernel and provide the snd_ctl_* elements an
application actually works with). To get some kind of sensible
ordering for the roughly 200 controls the Scarlett driver exposes, I
had to choose the names "just right" to avoid all the special cases
and fall through to the "lexical sorting" case.
And while there are ALSA control types that could be used for
metering purposes, *no* mixer application supports them; furthermore,
the device actually returns the metering information as a "block"
(e.g. an "all captures" block, an "all outputs" block, ...) and not
"per channel", as required by ALSA. There is no simple way to
associate the meter with the actual gain control (IIRC, if you choose
the names "correctly", ALSA will combine several snd_hctl_* into one
snd_ctl_* - at least that's what happens with Mute). This is the
point where some drivers fall back to providing snd_hwdep_* (hardware
dependent API) access, which requires a corresponding user-space
program. Because of the severe lack of documentation on this part of
ALSA, the current Scarlett driver simply does not support metering.
Also, during the process of upstreaming the driver, the "save mixer
settings to hardware (NVRAM)" feature (for standalone operation) had
to go, because ALSA simply has no appropriate control element type to
expose that feature of the device.
> That kind of "let the hardware do the job on its own whenever it
> can" mind frame was what made Firewire the choice for professional
> work for a really long time.
>
> So the user notices that monitoring with effects leads to larger
> time lag and larger CPU load and higher likelihood of dropouts. He
> has a choice then. He can switch off the effects.
For most of the history of digital music/video/..., computers were
not fast enough to do any serious processing. So to get anywhere,
reliably, you simply *had to* use hardware wherever you could.
Nowadays even digital mixing consoles tend towards software effect
processing / plugins on general-purpose CPUs instead of dedicated DSP
chips. The earlier digital audio consoles were also quite limited by
their hardware implementation, e.g. "You can route X to Y, but not to
Z, because reasons". On the other hand, I believe the reason the
Focusrite Scarlett is so flexible is that it's based on the xCORE
chip, which has up to 32 CPU/microcontroller cores that can easily
implement tasks like "gain", "mux", ... - because it is all
*software*. It's all about *controlling* latency and load - which is
exactly what JACK tries to do.
Tobias