[linux-audio-dev] Pitchshift/Timestretch project..
cournape at enst.fr
Tue Apr 6 09:51:34 UTC 2004
Florian Schmidt wrote:
>I've only been thinking about how this is done for very short periods of
>time. My naive approach to timestretching would be to transform the
>signal into the frequency domain [either by windowed Fourier or by
>wavelet transform]. and then afterwards retransform, but with a changed
>time base. Actually i rather think of it as synthesizing the
>timestretched material from the frequency information..
Well, kind of. The idea of the phase vocoder, which more or less
describes what you said,
is to decompose each time-domain frame into N frequency bins, and to
assume that there is only one underlying stationary sinusoid in each
frequency channel. If this is the case, you unwrap the phase to estimate
the frequency of the sinusoid, and you resynthesize it with a
longer or shorter time frame.
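
To make the steps above concrete, here is a minimal sketch in Python with NumPy. It is only an illustration of the textbook phase vocoder, not code from any particular project: the function name, the Hann window, the default sizes and the overlap-add normalisation are all my own assumptions, and it expects a mono signal longer than n_fft.

```python
import numpy as np

def phase_vocoder_stretch(x, stretch, n_fft=1024, hop=256):
    """Time-stretch mono signal x by `stretch` (>1 means longer).

    Hypothetical helper for illustration; assumes len(x) >= n_fft.
    """
    window = np.hanning(n_fft)

    # Analysis: windowed frames taken every `hop` samples, then FFT.
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    spec = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spec), np.angle(spec)

    # Phase advance expected over one analysis hop at each bin centre.
    omega = 2 * np.pi * hop * np.arange(spec.shape[1]) / n_fft

    # Synthesis: same frames, but spaced `syn_hop` samples apart.
    syn_hop = int(round(hop * stretch))
    out = np.zeros(n_fft + syn_hop * (n_frames - 1))
    norm = np.zeros_like(out)
    syn_phase = phase[0].copy()

    for t in range(n_frames):
        if t > 0:
            # Deviation from the expected advance, wrapped to [-pi, pi).
            dphi = phase[t] - phase[t - 1] - omega
            dphi = (dphi + np.pi) % (2 * np.pi) - np.pi
            # omega + dphi is the unwrapped advance per analysis hop,
            # i.e. the sinusoid's frequency; rescale it to the new hop.
            syn_phase += (omega + dphi) * (syn_hop / hop)
        frame = np.fft.irfft(mag[t] * np.exp(1j * syn_phase), n=n_fft)
        start = t * syn_hop
        out[start:start + n_fft] += frame * window
        norm[start:start + n_fft] += window ** 2

    # Weighted overlap-add normalisation; the floor avoids dividing by ~0
    # at the very edges where only one tapered frame contributes.
    return out / np.maximum(norm, 1e-3)
```

Feeding it a steady sine and a stretch of 2.0 should give a signal roughly twice as long at the same pitch. On transients it smears, which is the limitation discussed next.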
The problem is that it demands short windows (for the hypothesis of one
stationary sinusoid in each frequency channel to be valid), which means
very poor frequency resolution. Basically, you have to make a trade-off
between time resolution and frequency resolution (nothing new here ;)).
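
To put numbers on that trade-off, a small sketch: for a window of N samples at a given sample rate, the FFT bins are spaced sample_rate / N Hz apart while the window lasts N / sample_rate seconds, so their product is always 1. The function name and the 44100 Hz rate are just assumptions for the example.

```python
def stft_resolution(sample_rate, window_size):
    """Return (bin spacing in Hz, window duration in seconds)."""
    bin_spacing_hz = sample_rate / window_size
    duration_s = window_size / sample_rate
    return bin_spacing_hz, duration_s

# Shorter windows localise events better in time but blur frequency.
for n in (256, 1024, 4096):
    df, dt = stft_resolution(44100, n)
    print(f"N={n:5d}: bins every {df:7.2f} Hz, window lasts {dt * 1000:6.2f} ms")
```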
So the idea is to adapt the window size to the content of the signal,
which means being able to detect the transients (which are better
stretched with small windows)...
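
As a rough illustration of "adapt the window size to the content", here is a deliberately naive sketch that flags a frame as a transient when its energy jumps sharply compared to the previous frame, and picks a short window there. This is my own simplification, not the method from the thesis mentioned below; the function name, the window sizes and the energy-ratio threshold are all assumed values.

```python
import numpy as np

def choose_window_sizes(x, hop=256, short=256, long=2048, ratio=4.0):
    """Pick an analysis window size per hop-sized frame.

    Hypothetical helper: a frame whose energy exceeds `ratio` times the
    previous frame's energy is treated as a transient and gets `short`.
    """
    n_frames = len(x) // hop
    # Small offset keeps the ratio finite on digital silence.
    energy = np.array([np.sum(x[i * hop:(i + 1) * hop] ** 2)
                       for i in range(n_frames)]) + 1e-12
    sizes = np.full(n_frames, long)
    jump = energy[1:] / energy[:-1]
    sizes[1:] = np.where(jump > ratio, short, long)
    return sizes
```

A real detector would look at more than raw energy (spectral change, for instance), but the principle of switching window length around detected onsets is the same.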
J Bonada wrote his PhD on this subject, if you are interested:
>I'm sure i miss something [haven't actually looked at the math
>[especially for the retransform with changed timebase] though i do have
>some fourier/wavelet transform knowledge]