[LAD] twice as loud

Ralf Mardorf ralf.mardorf at alice-dsl.net
Tue Jul 27 16:57:46 UTC 2010


On Tue, 2010-07-27 at 11:10 -0500, Charles Henry wrote:
> On Mon, Jul 26, 2010 at 11:30 PM, Jens M Andreasen
> <jens.andreasen at comhem.se> wrote:
> >
> > On Sun, 2010-07-25 at 14:24 +0200, Philipp Überbacher wrote:
> >
> >> It would be strange but funny if an estimate of sound A just about
> >> masking sound B would correspond to 'twice as loud'.
> >
> > Masking appears somewhere in the 30 to 40dB range - which is way beyond
> > "twice as loud".
> 
> Right.  The two are not related and they're in wholly different orders
> of magnitude.  I'd be reluctant to put a number on it, though...
> 
> Because psychoacoustics just hasn't been defined in a way to make hard
> numbers stick.  The tendency in psychoacoustic experimental design is
> to use discrete conditions (which gives better experimental power) in
> order to show that an effect exists.  But this way, any given
> experiment can't produce results that cover the whole space.
> Generalization and extrapolation are limited.
> 
> A psychoacoustic relationship is a map between a set of acoustically
> presented signals and a set of sensory experiences.  Loudness, pitch,
> timbre are the three terms used to describe sounds in psychoacoustics,
> which might lead one to think they are orthogonal or separable.  The
> problem of describing the non-linear psychoacoustic map is that
> relations don't apply the same way to different neighborhoods in the
> spaces involved.  With appropriate techniques and *lots* of data, we
> could come up with models that describe the curvature of those maps
> locally at each point in the space.  What we think of as loudness is
> just one way of assigning a scale to a path in the space which
> connects sounds of similar pitch and timbre.
> 
> Masking is an interesting effect to look at topologically.  Consider
> that points in the set of sensory experiences may be more or less
> distant from each other based on their degree of similarity.
> 
> Although acoustically, we can have a metric that separates all signals
> from each other, two sounds (psychological) may be indistinguishable
> from each other.  The topology on this space is determined by a
> pseudo-metric in which d(p1,p2)=0 => p1 and p2 are indistinguishable
> from each other.  This generates a coarse topology with smallest open
> sets consisting of sounds that are indistinguishable from each other.
> 
> Describing the masking effect means finding the inverse image of the
> psychoacoustic map where a collection of distinct acoustic signals map
> onto points in the same open set.
> 
> Suppose we have two signals s1 and s2, and we construct a third sound
> s3=s1+a*s2.  For some range of values of a, s3 can be made
> indistinguishable from s1.  This describes just *one* local dimension
> along which s1 masks s2, as long as a*s2 also corresponds to a
> non-zero point in the psychoacoustic image.
> 
> Well, I just wanted to get a few ideas out there to have some fun with
> this discussion :)  I'm a late-comer since I had some other
> obligations to attend to last week.
> 
> Best,
> Chuck
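
To put rough numbers on the s3=s1+a*s2 construction quoted above, here
is a minimal Python sketch. It only converts between the scale factor a
and a level difference in dB, and applies the common rule of thumb that
+10 dB is heard as roughly twice as loud. That rule is an assumption
brought in for illustration, not something stated in this thread, and
the tone frequencies, sample rate and function names are hypothetical.
The code does not decide whether s3 is actually indistinguishable from
s1; that depends on the signals and on the listener.

```python
import math

SAMPLE_RATE = 48000  # Hz, assumed for illustration


def gain_for_db(level_db):
    """Amplitude factor a that shifts a signal's level by level_db decibels."""
    return 10.0 ** (level_db / 20.0)


def loudness_ratio(level_db):
    """Assumed rule of thumb (not from this thread): +10 dB ~ twice as loud."""
    return 2.0 ** (level_db / 10.0)


def sine(freq_hz, n_samples):
    """Unit-amplitude sine tone as a list of floats."""
    step = 2.0 * math.pi * freq_hz / SAMPLE_RATE
    return [math.sin(step * i) for i in range(n_samples)]


def rms(signal):
    """Root-mean-square value of a non-empty signal."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix(s1, s2, a):
    """s3 = s1 + a*s2, sample by sample."""
    return [x + a * y for x, y in zip(s1, s2)]


if __name__ == "__main__":
    s1 = sine(1000.0, SAMPLE_RATE)  # hypothetical masker
    s2 = sine(1100.0, SAMPLE_RATE)  # hypothetical masked tone
    for level_db in (-10.0, -30.0, -40.0):
        a = gain_for_db(level_db)
        s3 = mix(s1, s2, a)
        # Level of the scaled tone a*s2 relative to s1, in dB.
        rel = 20.0 * math.log10(rms([a * y for y in s2]) / rms(s1))
        print(f"a={a:.4f}  a*s2 re s1: {rel:6.1f} dB  "
              f"loudness ratio ~{loudness_ratio(-level_db):.0f}x  "
              f"rms(s3)={rms(s3):.4f}")
```

Under that rule of thumb, a 30 to 40 dB level difference corresponds to
a loudness ratio of roughly 8 to 16, which is the sense in which masking
is "way beyond twice as loud".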

Masking theories used by audio codecs to compress audio signals don't
work for people who are trained in listening, e.g. MP3 at any bit rate
is a PITA. Trained people (here where I live) always hear a loss in
blind tests.

Even when using dummy heads for recordings, 3D psychoacoustics doesn't
work that well (even if the dummy head isn't that bad ;), e.g. there's
always a problem with directions such as ahead and above.

So for the 'twice as loud' issue you should ask: "In which way do we
listen, and how do we understand what we listen to?"

Before the brain does "math", are any other senses involved in the
interpretation of the input given by the ears? (Btw. I'm sure that math
is just a part of nature and can't describe nature, because it's just a
part.)

Regarding the topic that there are two sound sources and you are
thinking about a relation: try to imagine people who are autistic, or
who are 'normal' but having a panic attack, or try to remember a
situation in which you barely escaped an accident. The filtering is
completely different. Every one of us is able to focus on allegedly
masked "things".

Perception, the interpretation of the input from all senses, changes
with the context; for audio, even with the tilt of your head relative
to your body.

Recordings and digital audio virtualization always lack the experienced
context. Even if it were possible to capture everything of nature
completely by math for specific situations, there would still be the
context, e.g. a situation where everybody's brain is able to count the
peas that drop to the ground after a glass of peas falls down.

Remember your own experiences when you had or nearly had an accident.
Time seemed to be slower and silent, but important sounds (regarding
your survival) became loud, while loud but unimportant sounds became
silent.



