[linux-audio-dev] Pitchshift/Timestretch project..

Cournapeau David cournape at enst.fr
Tue Apr 6 07:09:00 UTC 2004

Christian Schoenebeck wrote:

>On Monday, 5 April 2004 at 23:33, Erik de Castro Lopo wrote:
>>Well me. I've been working on this since the start of the year, but
>>been thinking about the problem for over 10 years.
>Which brings me to the question: how old are you? :P
>Just kidding. But I also planned to do some research on pitch shifting in 
>conjunction with formant correction. If anybody has good material about that, 
>don't hide it! It seems this field is crying out for more work.
I have already played with that stuff, though only on the pure DSP side. For 
time scaling, one of the best approaches, I think, is to use a phase 
vocoder with phase locking and an adaptive frame size. The trick is to 
detect the transients in the file correctly: where there is a transient, 
choose a small frame (i.e. ~256 samples at 44100 Hz); where there is no 
transient, choose a bigger frame.
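
To make the frame-size switching concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the detector is a crude block-energy ratio test (a stand-in, not the method from the papers mentioned in this post), the 2048-sample "big" frame and the ratio threshold of 4 are arbitrary choices, and all function names are made up for the example.

```python
import math

SMALL_FRAME = 256    # from the text: ~256 samples at 44100 Hz
LARGE_FRAME = 2048   # assumption: the text only says "a bigger frame"


def block_energies(signal, hop):
    """Mean-square energy of consecutive non-overlapping blocks of `hop` samples."""
    return [
        sum(x * x for x in signal[i:i + hop]) / hop
        for i in range(0, len(signal) - hop + 1, hop)
    ]


def choose_frame_sizes(signal, hop=256, ratio=4.0, floor=1e-8):
    """Return one frame size per block: small where the energy jumps, large elsewhere.

    A block counts as a transient when its energy exceeds `ratio` times the
    energy of the previous block (assumed, very simple detector).
    """
    sizes = []
    prev = floor
    for energy in block_energies(signal, hop):
        is_transient = energy > ratio * max(prev, floor)
        sizes.append(SMALL_FRAME if is_transient else LARGE_FRAME)
        prev = energy
    return sizes


def demo_signal(sr=44100):
    """Four blocks of silence followed by four blocks of a 440 Hz sine burst."""
    quiet = [0.0] * (4 * 256)
    burst = [0.5 * math.sin(2 * math.pi * 440 * n / sr) for n in range(4 * 256)]
    return quiet + burst


sizes = choose_frame_sizes(demo_signal())
print(sizes)
```

Only the block where the burst starts gets the small frame; the steady part of the sine goes back to the large frame. A real detector would work on spectral features rather than raw energy.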

Check, for example, the article by Laroche and Dolson on the phase-locked 
phase vocoder:

*Improved phase vocoder time-scale modification of audio*
/Laroche, J.; Dolson, M./
IEEE Transactions on Speech and Audio Processing, Volume 7, Issue 3,
May 1999, pages 323-332
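
As a rough illustration of what phase locking means in practice, here is a sketch of the per-frame phase update in Python. It is a simplified take on the "identity phase locking" idea: peak bins are propagated as in a standard phase vocoder, and every other bin keeps its analysis phase offset relative to the nearest peak. The simplifications are assumptions of this sketch: each bin is assigned to the nearest peak, peaks are not tracked across bins between frames, and the function names and argument layout are invented for the example. The FFT analysis and overlap-add resynthesis are left out.

```python
import math

TWO_PI = 2.0 * math.pi


def princarg(phi):
    """Wrap a phase value into [-pi, pi)."""
    return (phi + math.pi) % TWO_PI - math.pi


def find_peaks(mags):
    """Indices of local maxima of the magnitude spectrum (edge bins excluded)."""
    return [k for k in range(1, len(mags) - 1)
            if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]]


def locked_phases(mags, prev_ana, cur_ana, prev_syn, n_fft, hop_a, hop_s):
    """Synthesis phases for one frame, each bin locked to its nearest peak.

    mags, prev_ana, cur_ana, prev_syn are per-bin lists; hop_a and hop_s are
    the analysis and synthesis hop sizes in samples.
    """
    peaks = find_peaks(mags)
    if not peaks:
        return list(cur_ana)  # nothing to lock to: keep the analysis phases

    # 1. Propagate the phase of each peak bin as a standard phase vocoder would.
    peak_syn = {}
    for p in peaks:
        omega = TWO_PI * p / n_fft                      # bin centre, rad/sample
        dev = princarg(cur_ana[p] - prev_ana[p] - omega * hop_a)
        inst_freq = omega + dev / hop_a                 # estimated true frequency
        peak_syn[p] = prev_syn[p] + inst_freq * hop_s

    # 2. Lock every bin to its nearest peak, preserving the analysis phase
    #    difference between the bin and that peak.
    syn = []
    for k in range(len(mags)):
        p = min(peaks, key=lambda q: abs(q - k))
        syn.append(princarg(peak_syn[p] + cur_ana[k] - cur_ana[p]))
    return syn
```

The time-stretch factor is hop_s / hop_a. Because the bins around a peak keep their relative phases, the "phasiness" of the plain phase vocoder is reduced.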

As for transient detection, I happened to do my master's-equivalent DSP degree in 
England in the same lab as C. Duxbury, who did his PhD on that 
topic. See, for example, the articles available at dafx.de. See also 
Axel Röbel's articles; he currently works at IRCAM.

One other idea I had recently, but haven't tried, is to use Matching 
Pursuit to get a decomposition into frequency/scale/time atoms, and play 
with it. In fact, basic time-domain time-stretching techniques are a kind 
of granular synthesis (see, for example, Ableton Live or VariPhrase). 
The idea of Matching Pursuit is to find an adaptive decomposition 
using a dictionary, which is a redundant set of vectors (the atoms), and to 
split the signal in the time/scale/frequency domain. The algorithms are quite 
difficult and very computationally intensive, but the theory is beautiful. I 
have used this technique myself for multi-pitch detection, following an 
article by Gribonval and Bacry, and some of the results are impressive. Look for 
harmonic Matching Pursuit if you are interested.
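
For readers who have not seen it, the basic (non-harmonic) Matching Pursuit loop is short. The sketch below is a toy version in Python, not the harmonic variant from the article: the dictionary (unit impulses plus normalized cosines) is an arbitrary choice made for the example, and all names are hypothetical. At each step it picks the atom most correlated with the residual, records the coefficient, and subtracts that atom's contribution.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def make_dictionary(n):
    """Toy redundant dictionary: n unit impulses plus unit-norm cosines."""
    atoms = [[1.0 if i == k else 0.0 for i in range(n)] for k in range(n)]
    for f in range(1, n // 2):
        atoms.append(normalize(
            [math.cos(2 * math.pi * f * i / n) for i in range(n)]))
    return atoms


def matching_pursuit(signal, atoms, n_iter=10, tol=1e-9):
    """Greedy decomposition: returns [(atom_index, coefficient), ...] and the residual."""
    residual = list(signal)
    decomposition = []
    for _ in range(n_iter):
        corrs = [dot(residual, a) for a in atoms]
        best = max(range(len(atoms)), key=lambda j: abs(corrs[j]))
        coeff = corrs[best]
        if abs(coeff) < tol:
            break  # residual is (numerically) fully explained
        decomposition.append((best, coeff))
        residual = [r - coeff * a for r, a in zip(residual, atoms[best])]
    return decomposition, residual


# Demo: a cosine atom (frequency index 4) plus a click at sample 10.
n = 32
atoms = make_dictionary(n)
cos_atom = atoms[n + 3]          # impulses occupy 0..n-1, cosines start at f=1
signal = [3.0 * c for c in cos_atom]
signal[10] += 2.0
decomposition, residual = matching_pursuit(signal, atoms)
print(decomposition)
```

In this demo the two components happen to be orthogonal, so two iterations recover them exactly. With a realistic dictionary the atoms overlap, the loop runs much longer, and the correlation step over the whole dictionary is where the computational cost comes from.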


