[linux-audio-dev] Re: Speech analysis

Lee Revell rlrevell at joe-job.com
Mon Jun 6 18:10:59 UTC 2005


On Mon, 2005-06-06 at 10:43 -0700, Brad Arant wrote:
> Many of the latest speech recognition innovations use neural network
> technology with back propagation for training and learning. They can be
> trained to recognize a wide range of voice types and can detect words strung
> together into normal speech. The input to the neural net is a formant
> analysis using FFT to create the harmonic pattern. With proper arrangement it
> will even accommodate variances in the speed of speech as well as whether the
> voice is male or female. It can also return a signal of the inflections made
> by the speaker.
> 
> It is an item that has been studied for years in the computer science realm
> and there is no quick solution to do it well.

Actually the interesting work is in the perception / cognitive
psychology area.  Once you have that, the CS side is pretty simple.
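
To give a rough idea of the FFT front end described in the quoted text,
here is a minimal sketch in plain Python. Everything in it is an
assumption made for illustration: the sample rate, the frame size, the
helper names magnitude_spectrum and spectral_peaks, and the synthetic
two-sinusoid frame standing in for real speech. It uses a naive DFT
instead of a real FFT library to stay dependency-free, and it only picks
raw spectral peaks. A real formant analysis would more likely estimate
the spectral envelope (for example with LPC or cepstral smoothing)
before handing features to a neural net.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz, assumed for this sketch
FRAME_SIZE = 256    # samples per analysis frame, assumed


def magnitude_spectrum(frame):
    """Naive DFT magnitude spectrum of a Hann-windowed frame (bins 0..N/2)."""
    n = len(frame)
    # Hann window reduces leakage from the frame edges.
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def spectral_peaks(mags, count):
    """Return the bin indices of the `count` strongest local maxima."""
    peaks = [k for k in range(1, len(mags) - 1)
             if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]]
    peaks.sort(key=lambda k: mags[k], reverse=True)
    return sorted(peaks[:count])


# Synthetic frame: two sinusoids standing in for two resonances.
frame = [math.sin(2 * math.pi * 500 * t / SAMPLE_RATE)
         + 0.5 * math.sin(2 * math.pi * 1500 * t / SAMPLE_RATE)
         for t in range(FRAME_SIZE)]

mags = magnitude_spectrum(frame)
peak_bins = spectral_peaks(mags, 2)
# Convert bin indices to frequencies in Hz.
peak_hz = [k * SAMPLE_RATE / FRAME_SIZE for k in peak_bins]
print(peak_hz)
```

A vector like peak_hz (or the whole mags list) per frame is the kind of
input such a network could be trained on. Deciding which features
actually matter to a listener is the perceptual question.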

Lee



