On Tue, 2010-07-27 at 11:10 -0500, Charles Henry wrote:
On Mon, Jul 26, 2010 at 11:30 PM, Jens M Andreasen
<jens.andreasen(a)comhem.se> wrote:
On Sun, 2010-07-25 at 14:24 +0200, Philipp Überbacher wrote:
It would be strange but funny if an estimate of sound A just about
masking sound B would correspond to 'twice as loud'.
Masking appears somewhere in the 30 to 40dB range - which is way beyond
"twice as loud".
Right. The two are not related and they're in wholly different orders
of magnitude. I'd be reluctant to put a number on it, though...
Because psychoacoustics just hasn't been defined in a way to make hard
numbers stick. The tendency in psychoacoustic experimental design is
to use discrete conditions (which gives better experimental power) in
order to show that an effect exists. But this way, any given
experiment can't produce results that cover the whole space.
Generalization and extrapolation are limited.
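To give a feel for the gap between the quoted 30 to 40 dB figure and
"twice as loud", here is a minimal sketch that converts level
differences in dB into amplitude and power ratios. The 10 dB entry is an
assumption that is not from this thread: it is the common rule of thumb
that roughly 10 dB corresponds to a doubling of perceived loudness, and
it is only a rough guide. The function names are made up for
illustration.

```python
def db_to_amplitude_ratio(db):
    """Amplitude (sound pressure) ratio for a level difference in dB."""
    return 10 ** (db / 20)


def db_to_power_ratio(db):
    """Power (intensity) ratio for a level difference in dB."""
    return 10 ** (db / 10)


# 10 dB: assumed rule-of-thumb value for "twice as loud" (not from the thread).
# 30 and 40 dB: the masking range quoted above.
for db in (10, 30, 40):
    amp = db_to_amplitude_ratio(db)
    power = db_to_power_ratio(db)
    print(f"{db} dB -> amplitude x{amp:.2f}, power x{power:.0f}")
```

Under that assumption, 40 dB is an amplitude ratio of 100 against
roughly 3.16 for 10 dB, which is the "wholly different orders of
magnitude" in plain arithmetic.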
A psychoacoustic relationship is a map between a set of acoustically
presented signals and a set of sensory experiences. Loudness, pitch,
timbre are the three terms used to describe sounds in psychoacoustics,
which might lead one to think they are orthogonal or separable. The
problem of describing the non-linear psychoacoustic map is that
relations don't apply the same way to different neighborhoods in the
spaces involved. With appropriate techniques and *lots* of data, we
could come up with models that describe the curvature of those maps
locally at each point in the space. What we think of as loudness is
just one way of assigning a scale to a path in the space which
connects sounds of similar pitch and timbre.
Masking is an interesting effect to look at topologically. Consider
that points in the set of sensory experiences may be more or less
distant from each other based on their degree of similarity.
Although acoustically, we can have a metric that separates all signals
from each other, two sounds (psychological) may be indistinguishable
from each other. The topology on this space is determined by a
pseudo-metric in which d(p1,p2)=0 => p1 and p2 are indistinguishable
from each other. This generates a coarse topology with smallest open
sets consisting of sounds that are indistinguishable from each other.
Describing the masking effect means finding the inverse image of the
psychoacoustic map where a collection of distinct acoustic signals map
onto points in the same open set.
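The pseudo-metric idea above can be sketched with a toy model. Everything
here is hypothetical: the "psychoacoustic map" is just a quantiser on a
level in dB, and STEP_DB is an arbitrary resolution, not a measured
just-noticeable difference. The point is only to show d(p1,p2)=0 defining
classes of indistinguishable signals, and the inverse image of one
percept being a set of distinct signals.

```python
import math
from collections import defaultdict

STEP_DB = 1.0  # hypothetical resolution of the toy map, not a measured value


def percept(level_db):
    """Toy psychoacoustic map: send a level in dB to a coarse percept index."""
    return math.floor(level_db / STEP_DB)


def d(a, b):
    """Pseudo-metric on signals: distance between their percepts.

    d(a, b) == 0 does not imply a == b, only that the two signals
    land on the same percept.
    """
    return abs(percept(a) - percept(b))


def inverse_images(signals):
    """Group signals by percept, i.e. the inverse image of each percept."""
    classes = defaultdict(list)
    for s in signals:
        classes[percept(s)].append(s)
    return dict(classes)


signals = [60.0, 60.4, 60.9, 61.2, 63.5]
classes = inverse_images(signals)
print(classes)
```

Because d is built from a map into the integers, d(a,b)=0 is a proper
equivalence relation here, which a plain "difference below a threshold"
rule would not be.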
Suppose we have two signals s1 and s2, and we construct a third sound
s3=s1+a*s2. For some range of values of a, s3 can be made
indistinguishable from s1. This describes just *one* local dimension
along which s1 masks s2, as long as a*s2 also corresponds to a
non-zero point in the psychoacoustic image.
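The construction s3=s1+a*s2 can be sketched as follows. This only builds
the mixture and reports how far a*s2 sits below s1 in dB; whether s3 is
actually indistinguishable from s1 at a given a has to come from
listening tests, not from this code. The sine frequencies, sample rate
and helper names are assumptions chosen for illustration.

```python
import math


def sine(freq_hz, n=480, rate=48000):
    """n samples of a unit-amplitude sine at freq_hz."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


def mix(s1, s2, a):
    """Return s3 = s1 + a * s2, sample by sample."""
    return [x + a * y for x, y in zip(s1, s2)]


def rms(s):
    """Root-mean-square value of a signal."""
    return math.sqrt(sum(x * x for x in s) / len(s))


def relative_level_db(s1, s2, a):
    """Level of a*s2 relative to s1 in dB (requires a > 0)."""
    return 20 * math.log10(a * rms(s2) / rms(s1))


s1 = sine(1000)  # hypothetical masker
s2 = sine(1100)  # hypothetical maskee
a = 0.01
s3 = mix(s1, s2, a)
print(f"a*s2 is {relative_level_db(s1, s2, a):.1f} dB relative to s1")
```

Sweeping a and asking a listener at each step whether s3 differs from s1
would trace out the one local dimension described above.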
Well, I just wanted to get a few ideas out there to have some fun with
this discussion :) I'm a late-comer since I had some other
obligations to attend to last week.
Best,
Chuck
Masking theories used by audio codecs to compress audio signals don't
work for people who are trained in listening, e.g. MP3 at any bit rate is
a PITA. Trained people (here where I live) always hear a loss in
blind tests.
Even when using dummy heads for recordings, 3D psychoacoustics doesn't
work that well (even if the dummy head isn't that bad ;); there's
always a dependency on direction, e.g. ahead versus above.
So for the 'twice as loud' issue you should ask: "In which way do we
listen and understand what we listen to?"
Before the brain does "math", are there any other senses involved in the
interpretation of the input given by the ears? (Btw. I'm sure that math
is just part of nature and can't describe nature, because it's just a
part.)
Regarding the topic that there are two sound sources and you are
thinking about a relation, try to imagine people who are autistic or who
are 'normal' but having a panic attack, or try to remember a situation
when you were barely able to escape an accident. The filtering is
completely different. Every one of us is able to focus on allegedly
masked "things".
Perception, the interpretation of the input from all senses, will change
depending on the context; for audio, even depending on the tilt of
your head relative to the body.
Recordings and digital audio virtualization always lack the
experienced context. Even if it were possible to completely capture
everything in nature by math for specific situations, there would still
be the context, e.g. a situation where everybody's brain is able to count
the peas that drop to the ground after a glass of peas falls down.
Remember your own experiences when you had or nearly had an accident.
Time seemed to be slower and silent, but important sounds (regarding
your survival) became loud, while loud but unimportant sounds became
silent.