[linux-audio-dev] ableton live

Cournapeau David david.cournapeau at elec.qmul.ac.uk
Sat Jul 5 11:26:01 UTC 2003


Frank Barknecht wrote:

>Hallo,
>Adrian Gschwend hat gesagt: // Adrian Gschwend wrote:
>
>  
>
>>But as far as I know timestreching algorithms are 1. not easy to
>>implement and 2. not open source if they sound good :)
>>    
>>
>
>I'm quite sure, that Live uses a granular approach. If you timestretch
>far away from the original, it gets obvious. There are several open
>source granular synths. For incorporating into another software, I
>would recommend SndObj which I use in the syncgrain~ external for Pd.
>
>SndObj lives at: http://www.may.ie/academic/music/musictec/SndObj/main.html
>
>
>ciao
>  
>
I have already tried several basic algorithms for time stretching. You 
can basically split all the schemes into two families: time-domain 
methods and spectral-domain methods (using a phase vocoder).

In the time domain, there are several algorithms:
 - the OLA (overlap-add) method, which is straightforward: just copy blocks 
of memory into the output buffer. Sounds really bad.
 - the SOLA method: uses correlation to align frames before connecting them 
("grains" is not accurate, I think, as grains are normally blocks of a few ms, 
whereas the frames used in time-domain techniques typically have 2048 samples at 44.1 kHz).
 - the WSOLA method: quite similar, but more efficient.
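As a rough illustration, here is a minimal sketch of the naive OLA method in numpy (function and parameter names are my own choices, not from any of the papers below; a real SOLA implementation would additionally cross-correlate each frame against the output tail to find the best overlap offset before adding it):

```python
import numpy as np

def ola_stretch(x, rate, frame=2048, hop=512):
    """Naive OLA time stretch: read frames at hop*rate, overlap-add at hop.
    rate > 1 speeds up, rate < 1 slows down. This is the 'sounds really bad'
    variant: frames are not aligned before being overlap-added."""
    win = np.hanning(frame)
    in_hop = int(hop * rate)                    # analysis (read) hop
    n_frames = (len(x) - frame) // in_hop
    out = np.zeros(n_frames * hop + frame)
    norm = np.zeros_like(out)                   # window-overlap normalisation
    for i in range(n_frames):
        seg = x[i * in_hop : i * in_hop + frame] * win
        out[i * hop : i * hop + frame] += seg
        norm[i * hop : i * hop + frame] += win
    norm[norm < 1e-8] = 1.0                     # avoid division by zero at the edges
    return out / norm
```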

A good reference is "A Time-Scale Modification Algorithm Based on the 
Subband Time-Domain Technique for Broad-Band Signal Applications", by 
Roland K.C. Tan and Amerson H.J. Lin, J. Audio Eng. Soc., vol. 48, 
no. 5, May 2000.

Phase vocoder:
 - the basic approach (see http://www.musicdsp.org : I posted a matlab 
script there. As the script only uses the FFT, it should be easy to port to 
scilab or octave if you don't have access to matlab).
 - all the other approaches are derived from the basic one. The main 
problem is that a phase vocoder has to make a compromise between 
frequency resolution and time resolution. The idea is basically to 
adapt the frame size to the "harmonic vs transient" content (a bit 
like in MP3, for example, where the MDCT frame size is adapted).
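The basic approach can be sketched in a few lines of numpy (this is my own minimal sketch of a textbook phase vocoder, not the matlab script mentioned above; names and defaults are assumptions): analyse at hop*rate, resynthesise at hop, and propagate each bin's phase so the partials stay coherent across frames.

```python
import numpy as np

def pv_stretch(x, rate, n_fft=2048, hop=512):
    """Basic phase-vocoder time stretch. rate > 1 speeds up, rate < 1
    slows down. Output amplitude normalisation is omitted for brevity."""
    win = np.hanning(n_fft)
    a_hop = int(hop * rate)                     # analysis hop
    # expected phase advance per analysis hop, for each rfft bin
    omega = 2 * np.pi * np.arange(n_fft // 2 + 1) * a_hop / n_fft
    pos = np.arange(0, len(x) - n_fft, a_hop)
    out = np.zeros(len(pos) * hop + n_fft)
    phase = np.zeros(n_fft // 2 + 1)
    prev = None
    for k, p in enumerate(pos):
        spec = np.fft.rfft(win * x[p : p + n_fft])
        if prev is None:
            phase = np.angle(spec)
        else:
            # deviation of the measured phase increment from the expected one
            d = np.angle(spec) - np.angle(prev) - omega
            d -= 2 * np.pi * np.round(d / (2 * np.pi))    # wrap to [-pi, pi]
            phase += (omega + d) * hop / a_hop            # rescale to synthesis hop
        prev = spec
        frame = np.fft.irfft(np.abs(spec) * np.exp(1j * phase))
        out[k * hop : k * hop + n_fft] += win * frame
    return out
```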

The foundational article is this one: R. Portnoff, "Time-Scale 
Modifications of Speech Based on Short-Time Fourier Analysis", IEEE 
Trans. Acoust., Speech, Signal Processing, vol. 29(3), pp. 374-390, 1981.

More recent techniques ( with improvements in efficiency and quality ) :

J. Laroche and Mark Dolson, "New Phase-Vocoder Techniques for Real-Time 
Pitch Shifting, Chorusing..." in JAES, vol. 47, no. 11, 1999. It describes 
the phase-locking scheme.
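The identity phase-locking idea from that paper can be sketched roughly as follows (my own simplified version, with my own names: peaks are bare local maxima and each peak's region of influence ends midway to the next peak). Bins around a spectral peak get the peak's phase rotation added to their own analysis phase, instead of each bin propagating its phase independently:

```python
import numpy as np

def identity_phase_lock(ana_phase, syn_phase, mag):
    """Lock the synthesis phases of bins around each spectral peak to the
    peak's phase rotation, so a partial and its sidelobes stay coherent."""
    # peaks = bins strictly greater than both neighbours
    peaks = np.where((mag[1:-1] > mag[:-2]) & (mag[1:-1] > mag[2:]))[0] + 1
    if len(peaks) == 0:
        return syn_phase
    locked = syn_phase.copy()
    # region boundaries: midway between neighbouring peaks
    bounds = np.concatenate(([0], (peaks[:-1] + peaks[1:]) // 2 + 1, [len(mag)]))
    for p, lo, hi in zip(peaks, bounds[:-1], bounds[1:]):
        rot = syn_phase[p] - ana_phase[p]       # the peak's phase rotation
        locked[lo:hi] = ana_phase[lo:hi] + rot  # neighbours inherit it
        locked[p] = syn_phase[p]
    return locked
```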

I may be able to give you even more recent articles, as one of my 
colleagues did his thesis on this. I just have to find out if it has 
already been published.

P.S.: the time stretching in the new Windows Media Player is really 
good. I don't know which technique it uses.

