[linux-audio-dev] Re: Speech analysis
Lee Revell
rlrevell at joe-job.com
Mon Jun 6 18:10:59 UTC 2005
On Mon, 2005-06-06 at 10:43 -0700, Brad Arant wrote:
> Many of the latest speech recognition innovations use neural network
> technology with back propagation for training and learning. They can be
> trained to recognize a wide range of voice types and can detect words strung
> together into normal speech. The input to the neural net is a formant
> analysis using FFT to create the harmonic pattern. With proper arrangement it
> will even accommodate variances in the speed of speech as well as whether the
> voice is male or female. It can also return a signal of the inflections made
> by the speaker.
>
> It is an item that has been studied for years in the computer science realm
> and there is no quick solution to do it well.
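
For readers who want something concrete, here is a rough sketch of the
pipeline described in the quoted text: an FFT spectrum of each frame is fed
to a small network trained with back propagation. It is only a toy and not
a speech recognizer. It assumes NumPy, the "vowels" are synthetic two-tone
signals with made-up formant frequencies, and every name and constant in it
(RATE, FRAME, spectrum, synth_vowel, make_data, train, predict) is
hypothetical.

```python
import numpy as np

RATE = 8000   # sample rate in Hz (assumed for this toy)
FRAME = 256   # samples per analysis frame (assumed)

def spectrum(frame):
    """Log-magnitude FFT spectrum of one Hann-windowed frame."""
    windowed = frame * np.hanning(len(frame))
    return np.log1p(np.abs(np.fft.rfft(windowed)))

def synth_vowel(f1, f2, rng):
    """Toy 'vowel': two sinusoids at made-up formant frequencies plus noise."""
    t = np.arange(FRAME) / RATE
    ph = rng.uniform(0, 2 * np.pi, size=2)
    tone = np.sin(2 * np.pi * f1 * t + ph[0]) + 0.5 * np.sin(2 * np.pi * f2 * t + ph[1])
    return tone + 0.1 * rng.standard_normal(FRAME)

def make_data(n, rng):
    """Return n spectra and their 0/1 class labels."""
    formants = ((300, 2300), (700, 1200))  # illustrative values only
    X, y = [], []
    for _ in range(n):
        label = int(rng.integers(0, 2))
        f1, f2 = formants[label]
        X.append(spectrum(synth_vowel(f1, f2, rng)))
        y.append(label)
    return np.array(X), np.array(y, dtype=float)

def train(X, y, hidden=8, epochs=300, lr=0.1, seed=0):
    """One hidden tanh layer, sigmoid output, plain batch gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.standard_normal((X.shape[1], hidden)) * 0.1
    b1 = np.zeros(hidden)
    W2 = rng.standard_normal(hidden) * 0.1
    b2 = 0.0
    for _ in range(epochs):
        # Forward pass.
        h = np.tanh(X @ W1 + b1)
        p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
        # Backward pass: gradient of the mean cross-entropy loss.
        d_out = (p - y) / len(y)
        d_h = np.outer(d_out, W2) * (1.0 - h ** 2)
        # Gradient step on every weight and bias.
        W2 -= lr * (h.T @ d_out)
        b2 -= lr * d_out.sum()
        W1 -= lr * (X.T @ d_h)
        b1 -= lr * d_h.sum(axis=0)
    return W1, b1, W2, b2

def predict(params, X):
    """Class 1 wherever the output unit's pre-sigmoid activation is positive."""
    W1, b1, W2, b2 = params
    h = np.tanh(X @ W1 + b1)
    return (h @ W2 + b2 > 0).astype(int)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X_train, y_train = make_data(200, rng)
    X_test, y_test = make_data(100, rng)
    mean = X_train.mean(axis=0)  # center features on the training mean
    params = train(X_train - mean, y_train)
    accuracy = (predict(params, X_test - mean) == y_test).mean()
    print("held-out accuracy:", accuracy)
```

Real speech would need overlapping frames, some handling of timing
variation across frames, and far more training data. The sketch only shows
how the FFT front end and the back propagation step fit together.
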
Actually, the interesting work is in the perception / cognitive
psychology area. Once you have this, the CS side is pretty simple.
Lee