On Mon, 2005-06-06 at 10:43 -0700, Brad Arant wrote:
Many of the latest speech recognition innovations use neural network
technology with back propagation for training and learning. They can be
trained to recognize a wide range of voice types and can detect words strung
together into normal speech. The input to the neural net is a formant
analysis using an FFT to create the harmonic pattern. With proper arrangement it
will even accommodate variances in the speed of speech as well as whether the
voice is male or female. It can also return a signal of the inflections made
by the speaker.
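
To make the FFT step above concrete, here is a rough sketch of how one frame
of audio might be turned into a crude "harmonic pattern". This is only an
illustration under several assumptions: it uses a naive DFT from the Python
standard library as a stand-in for a real FFT, it picks the strongest spectral
peaks rather than doing proper formant tracking, and every name in it
(magnitude_spectrum, spectral_peaks, the 8000 Hz sample rate, the three test
tones) is made up for the example, not taken from any particular recognizer.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..n/2, computed naively in O(n^2).

    A real system would use an FFT; the result is the same, only slower.
    """
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def hann(n):
    """Symmetric Hann window, used to reduce spectral leakage."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectral_peaks(frame, sample_rate, count=3):
    """Frequencies (Hz) of the `count` strongest local maxima, ascending."""
    n = len(frame)
    windowed = [x * w for x, w in zip(frame, hann(n))]
    mags = magnitude_spectrum(windowed)
    # A bin is a peak if it is higher than its left neighbour and at least
    # as high as its right neighbour.
    peaks = [(mags[k], k) for k in range(1, len(mags) - 1)
             if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]]
    peaks.sort(reverse=True)
    return sorted(k * sample_rate / n for _, k in peaks[:count])


# Hypothetical test frame: three sine tones standing in for formants.
SAMPLE_RATE = 8000
FRAME_LEN = 256
frame = [
    1.0 * math.sin(2 * math.pi * 500 * i / SAMPLE_RATE)
    + 0.6 * math.sin(2 * math.pi * 1500 * i / SAMPLE_RATE)
    + 0.3 * math.sin(2 * math.pi * 2500 * i / SAMPLE_RATE)
    for i in range(FRAME_LEN)
]

print(spectral_peaks(frame, SAMPLE_RATE))
```

In a setup like the one described, a vector of such per-frame features (or the
whole magnitude spectrum) would be what gets fed to the neural net. Real
formant analysis usually works from a smoothed spectral envelope rather than
raw peaks, so treat this as a toy.
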
It is a problem that has been studied for years in the computer science realm,
and there is no quick way to do it well.
Actually the interesting work is in the perception / cognitive
psychology area. Once you have this, the CS side is pretty simple.
Lee