[linux-audio-dev] Pitchshift/Timestretch project..

Cournapeau David cournape at enst.fr
Tue Apr 6 09:51:34 UTC 2004


Florian Schmidt wrote:

>
>I've only been thinking about how this is done for very short periods of
>time. My naive approach to timestretching would be to transform the
>signal into the frequency domain [either by windowed Fourier or by
>wavelet transform], and then afterwards retransform, but with a changed
>time base. Actually I rather think of it as synthesizing the
>timestretched material from the frequency information.
>
Well, kind of. The idea of the phase vocoder, which more or less
describes what you said, is to decompose each time-domain frame into N
frequency bins, and to assume that there is only one underlying
stationary sinusoid in each frequency channel. If that holds, you
unwrap the phase to recover the frequency of the sinusoid, and you
resynthesize it with a longer or shorter time frame.
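
To make that concrete, here is a minimal NumPy sketch of that
analysis/resynthesis loop (my own illustration, untested against real
material; the function name and parameter values are made up for the
example):

import numpy as np

def phase_vocoder_stretch(x, stretch=1.5, n_fft=2048, hop=512):
    # Hypothetical sketch of a basic phase-vocoder time stretch.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = [np.fft.rfft(window * x[i * hop : i * hop + n_fft])
              for i in range(n_frames)]

    # Expected phase advance per analysis hop for each bin's center
    # frequency (radians).
    bin_freqs = 2 * np.pi * np.arange(n_fft // 2 + 1) / n_fft
    expected = bin_freqs * hop

    syn_hop = int(round(hop * stretch))
    out = np.zeros(n_frames * syn_hop + n_fft)
    phase = np.angle(frames[0])

    for t in range(n_frames):
        mag = np.abs(frames[t])
        if t > 0:
            # Phase unwrapping: the deviation from the expected advance
            # gives the frequency of the one sinusoid assumed per channel.
            dphi = np.angle(frames[t]) - np.angle(frames[t - 1]) - expected
            dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))
            # Resynthesize by advancing that phase at the synthesis hop.
            phase += (expected + dphi) * syn_hop / hop
        frame = np.fft.irfft(mag * np.exp(1j * phase), n_fft)
        out[t * syn_hop : t * syn_hop + n_fft] += window * frame
    # Overlap-add gain normalization is omitted for brevity.
    return out

The key step is the phase unwrapping: it recovers the "true" frequency
of the sinusoid assumed in each channel, which is then advanced at the
synthesis hop instead of the analysis hop.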

The problem is that it demands short windows (so that the hypothesis of
one stationary sinusoid per frequency channel remains valid), which
means very poor frequency resolution. Basically, you have to make a
trade-off between time resolution and frequency resolution (nothing new
here ;)). So the idea is to adapt the window size to the content of the
signal, which means being able to detect transients (which are better
stretched with small windows)...
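
For illustration, one common transient indicator is spectral flux
(though not necessarily what Bonada uses; this helper is hypothetical):

import numpy as np

def spectral_flux(x, n_fft=1024, hop=256):
    # Per-frame spectral flux: sum of magnitude increases between
    # consecutive frames. Transients show up as sharp spikes.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    prev = np.zeros(n_fft // 2 + 1)
    flux = np.zeros(n_frames)
    for t in range(n_frames):
        mag = np.abs(np.fft.rfft(window * x[t * hop : t * hop + n_fft]))
        flux[t] = np.sum(np.maximum(mag - prev, 0.0))
        prev = mag
    return flux

Frames where the flux exceeds a threshold would then be processed with
short windows (good time resolution), the rest with long ones.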

J. Bonada wrote his PhD thesis on this subject, if you are interested:

http://www.kug.ac.at/iem/lehre/arbeiten/hammer.pdf

cheers,

David

>I'm sure I'm missing something [I haven't actually looked at the math,
>especially for the retransform with a changed time base, though I do
>have some Fourier/wavelet transform knowledge].
>
>Florian Schmidt
>



