On Tue, 06 Apr 2004 11:51:34 +0200
Cournapeau David <cournape@enst.fr> wrote:
Well, kind of. The idea of the phase vocoder, which more or less
describes what you said, is to decompose each time-domain frame into N
frequency bins, and to assume that there is only one underlying
stationary sinusoid in each frequency channel. If this holds, you
unwrap the phase to recover the frequency of the sinusoid, and you
resynthesize it with a longer or shorter time frame.
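For what it's worth, here is a minimal numpy sketch of that
analysis/resynthesis step; the frame size, hop lengths, and function
name are my own arbitrary choices, and the loop is the naive textbook
version with no amplitude normalization:

import numpy as np

def phase_vocoder_stretch(x, rate, n_fft=1024, hop=256):
    # Naive phase-vocoder time stretch: rate > 1 shortens the signal,
    # rate < 1 lengthens it. Parameters are illustrative only.
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win
              for i in range(0, len(x) - n_fft, hop)]
    spec = np.array([np.fft.rfft(f) for f in frames])
    # Expected phase advance of each bin over one analysis hop.
    omega = 2 * np.pi * np.arange(spec.shape[1]) * hop / n_fft
    out_hop = int(round(hop / rate))
    phase = np.angle(spec[0])
    y = np.zeros(len(frames) * out_hop + n_fft)
    for t in range(len(frames)):
        mag = np.abs(spec[t])
        if t > 0:
            # Phase unwrapping: the deviation from omega, wrapped to
            # [-pi, pi), gives the true frequency of the single
            # sinusoid assumed to sit in each channel.
            dphi = np.angle(spec[t]) - np.angle(spec[t - 1]) - omega
            dphi = (dphi + np.pi) % (2 * np.pi) - np.pi
            true_freq = omega + dphi  # radians per analysis hop
            phase = phase + true_freq * (out_hop / hop)
        frame = np.fft.irfft(mag * np.exp(1j * phase)) * win
        y[t * out_hop:t * out_hop + n_fft] += frame
    return y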
The problem is that it demands short windows (for the hypothesis of
one stationary sinusoid per frequency channel to hold), which means
very poor frequency resolution.
Actually it gives very poor frequency resolution at low frequencies
(where the period 1/F exceeds the frame length), if I remember my math
right.
Basically, you have to make a trade-off between time resolution and
frequency resolution (nothing new here ;)). So the idea is to adapt
the window size to the content of the signal, which means being able
to detect the transients (which are better stretched with small
windows)...
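As an illustration of that detection step, spectral flux is one common
transient detector (my choice here, not something from this thread);
the frame sizes and threshold are arbitrary:

import numpy as np

def spectral_flux_onsets(x, n_fft=512, hop=128, thresh=2.0):
    # Flag frames whose magnitude spectrum grows sharply: a crude
    # transient detector one could use to switch to smaller windows.
    win = np.hanning(n_fft)
    mags = np.array([np.abs(np.fft.rfft(x[i:i + n_fft] * win))
                     for i in range(0, len(x) - n_fft, hop)])
    # Keep only positive spectral differences (energy increases).
    flux = np.maximum(np.diff(mags, axis=0), 0).sum(axis=1)
    # Flag frames where flux exceeds thresh times the global median.
    med = np.median(flux) + 1e-12
    return np.where(flux > thresh * med)[0] + 1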
This is, IMHO, the preferred application domain for the wavelet
transform. By using the multiscale nature of the wavelet transform you
get the best of both worlds: good frequency resolution across the
frequency range, and also good time resolution for high-frequency
material.. I suppose lesser time resolution on low-frequency
components is OK, since the time resolution of the human ear for low
frequencies isn't as good as for high frequencies either.. But my math
is a bit shaky on the subject, and my assumption about the time
resolution of the human ear for low-frequency material might be
wrong..
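A quick sketch of what I mean, using PyWavelets (pywt); the wavelet,
decomposition depth, and toy signal are arbitrary choices of mine:

import numpy as np
import pywt

# Toy signal: a low-frequency tone plus a sharp click (a transient).
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 100 * t)
x[4000] += 1.0  # the click

# Dyadic multiscale decomposition: each coarser level halves the
# frequency band and doubles the time support, which is exactly the
# time/frequency trade-off discussed above.
coeffs = pywt.wavedec(x, 'db4', level=5)
for lvl, c in enumerate(coeffs):
    print(f"level {lvl}: {len(c)} coefficients")
# The click shows up as a few large coefficients in the fine
# (high-frequency, short-support) levels, while the 100 Hz tone is
# captured compactly in the coarse levels.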
J Bonada wrote his PhD on this subject, if you are interested:
http://www.kug.ac.at/iem/lehre/arbeiten/hammer.pdf
Thanks! I'll take a look..
--
kT