Florian Schmidt wrote:
I've only been thinking about how this is done for very short periods of
time. My naive approach to timestretching would be to transform the
signal into the frequency domain [either by windowed Fourier or by
wavelet transform] and then retransform it afterwards, but with a changed
time base. Actually, I rather think of it as synthesizing the
timestretched material from the frequency information.
Well, kind of. The idea of the phase vocoder, which is more or less
what you describe, is to decompose each time-domain frame into N
frequency bins and to assume that each frequency channel contains only
one underlying stationary sinusoid. If that is the case, you unwrap the
phase between successive frames to get the frequency of the sinusoid, and
you resynthesize it over a longer or shorter time frame.
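
To make the steps above concrete, here is a minimal sketch of a basic phase vocoder in Python with NumPy. It is only an illustration under my own assumptions, not a reference implementation: the function name, the Hann window, the FFT size and the hop size are all choices made for this example, and it makes no attempt to handle transients.

```python
import numpy as np

def phase_vocoder_stretch(x, stretch, n_fft=1024, hop_a=128):
    """Time-stretch x by `stretch` (>1 means longer), keeping the pitch.

    Hypothetical helper for illustration; all parameters are assumptions.
    """
    x = np.asarray(x, dtype=float)
    hop_s = int(round(hop_a * stretch))        # synthesis hop size
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop_a
    # Phase advance expected over one analysis hop at each bin's
    # centre frequency.
    omega = 2 * np.pi * np.arange(n_fft // 2 + 1) * hop_a / n_fft

    out = np.zeros(n_fft + hop_s * (n_frames - 1))
    norm = np.zeros_like(out)
    prev_phase = None
    synth_phase = None

    for i in range(n_frames):
        frame = x[i * hop_a : i * hop_a + n_fft] * window
        spec = np.fft.rfft(frame)
        mag, phase = np.abs(spec), np.angle(spec)

        if i == 0:
            synth_phase = phase.copy()
        else:
            # Deviation from the expected advance, wrapped to [-pi, pi).
            # This is the "phase unwrapping" step: omega + delta is the
            # true phase advance of the sinusoid assumed in each bin.
            delta = phase - prev_phase - omega
            delta = (delta + np.pi) % (2 * np.pi) - np.pi
            # Advance the synthesis phase over the (different) synthesis hop.
            synth_phase = synth_phase + (omega + delta) * (hop_s / hop_a)
        prev_phase = phase

        # Resynthesize the frame and overlap-add it at the new position.
        y = np.fft.irfft(mag * np.exp(1j * synth_phase), n=n_fft) * window
        start = i * hop_s
        out[start : start + n_fft] += y
        norm[start : start + n_fft] += window ** 2

    # Undo the window overlap gain where it is not negligible.
    mask = norm > 1e-3
    out[mask] /= norm[mask]
    return out
```

Calling phase_vocoder_stretch(x, 2.0) on a steady tone should give roughly twice the duration at the same pitch. On percussive material the attacks will smear, which is exactly the problem discussed next.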
The problem is that this demands short windows (for the hypothesis of one
stationary sinusoid per frequency channel to hold), which means
very poor frequency resolution. Basically, you have to make a trade-off
between time resolution and frequency resolution (nothing new here ;)).
So the idea is to adapt the window size to the content of the signal,
which means being able to detect the transients (which are better
stretched with small windows)...
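
To put rough numbers on the trade-off, and to show what "adapting the window size" could look like in its crudest form, here is a small Python sketch. The function names, the energy-ratio threshold and the two window sizes are assumptions made up for this example; a simple energy-jump test stands in for real transient detection and is not the method from any particular thesis.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, bin spacing in Hz) for one window."""
    time_res_ms = 1000.0 * window_size / sample_rate
    freq_res_hz = sample_rate / window_size
    return time_res_ms, freq_res_hz

def frame_energies(x, frame):
    """Energy of consecutive non-overlapping frames of length `frame`."""
    return [sum(s * s for s in x[i:i + frame])
            for i in range(0, len(x) - frame + 1, frame)]

def pick_window_sizes(x, frame=256, short=256, long=2048, ratio=4.0):
    """Choose a short window where the energy jumps (a crude transient
    test, assumed here for illustration) and a long window elsewhere."""
    energies = frame_energies(x, frame)
    sizes = [long]                      # nothing to compare the first frame to
    for prev, cur in zip(energies, energies[1:]):
        is_transient = cur > ratio * (prev + 1e-12)
        sizes.append(short if is_transient else long)
    return sizes

if __name__ == "__main__":
    # Longer windows: finer frequency bins, coarser time localisation.
    for n in (256, 1024, 4096):
        ms, hz = stft_resolution(n, 44100)
        print("window %5d: %7.2f ms, %7.2f Hz per bin" % (n, ms, hz))
```

Whatever window you pick, the window duration in seconds times the bin spacing in Hz is always 1, which is the trade-off in a nutshell.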
J Bonada wrote his PhD on this subject, if you are interested:
http://www.kug.ac.at/iem/lehre/arbeiten/hammer.pdf
cheers,
David
I'm sure I'm missing something [I haven't actually looked at the math
[especially for the retransform with a changed time base], though I do
have some Fourier/wavelet transform knowledge]
Florian Schmidt