On Mon, Mar 9, 2009 at 2:59 PM, Olivier Guilyardi
<list@samalyse.com> wrote:
alex stone wrote:
> If you're intent on automating a speech analysis, voice noise removal
> device of some sort, then you might do well to start with a 'pre and
> post' framework. Things like lipsmacking, glottal and nasal noise for
> the end of phrase, etc, are fairly easy to identify, and generally occur
> pre and post. So that may well be a decent percentage of any cleanup
> done quickly. (Dependent of course on language. Cleaning up Russian
> would be a different 'module' to cleaning up French, or Finnish.)
That sounds encouraging. What do you mean by "pre and post" (sorry if that's an
obvious question to you)?
> And maybe there's a coding clue there too. A modular, language centric
> approach, based on loading a module designed specifically for a
> particular language and phrasal interpretation. (Module 1 spoken=French,
> Module 2 sung=French, Module 3 spoken=Finnish, etc....)
I think you're right. There's nothing universal about making noise and actually
meaning something ;)
Olivier
"Pre and post" means the start and finish of a recorded wav or region, for
example the first few and the last few milliseconds or so. Most of the noise
there would be obvious to the ear, so I can imagine that editing it could be
mechanised in some way. (Being careful, of course, not to dehumanise the
original recording too far.)
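
As a rough illustration of the idea, here is a minimal Python sketch that marks
the "pre" and "post" regions of a clip and turns them down rather than cutting
them outright. The function names, the 50 ms window and the 0.25 gain are all
assumptions made up for this example, not values taken from any existing tool,
and a real cleanup would need to detect the noise rather than attenuate blindly.

```python
def edge_regions(num_samples, sample_rate, window_ms=50.0):
    """Return ((pre_start, pre_end), (post_start, post_end)) sample index ranges.

    The ranges cover the first and last `window_ms` milliseconds of the clip.
    `window_ms` is an illustrative default, not a recommended value.
    """
    n = int(sample_rate * window_ms / 1000.0)
    # On very short clips, keep the two regions from overlapping.
    n = min(n, num_samples // 2)
    return (0, n), (num_samples - n, num_samples)


def attenuate_edges(samples, sample_rate, window_ms=50.0, gain=0.25):
    """Scale the pre and post regions by `gain`, leaving the middle untouched.

    Attenuating instead of deleting is one hypothetical way to avoid
    "dehumanising" the recording too far.
    """
    (pre_start, pre_end), (post_start, post_end) = edge_regions(
        len(samples), sample_rate, window_ms
    )
    out = list(samples)
    for i in range(pre_start, pre_end):
        out[i] = samples[i] * gain
    for i in range(post_start, post_end):
        out[i] = samples[i] * gain
    return out
```

The samples are assumed to be a plain list of floats for one mono channel.
Reading them from an actual wav file, and swapping in a per-language module to
decide what counts as noise, is left out of the sketch.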