[LAD] Speech noise removal

alex stone compose59 at gmail.com
Mon Mar 9 11:46:10 UTC 2009


On Mon, Mar 9, 2009 at 2:33 PM, Olivier Guilyardi <list at samalyse.com> wrote:

> alex stone wrote:
> >
> > On Mon, Mar 9, 2009 at 1:10 PM, Olivier Guilyardi <list at samalyse.com> wrote:
> >
> >     I'm looking for material, docs and/or software to remove speech
> >     noise, as caused by the movements of the mouth.
> >
> > In the commercial world this is one of the strengths of protools, and i
> > know 2 voice editors (full time) who use pt for precisely this task.
> > (Others include Sequoia, Nuendo, etc...)
>
> Does protools include some automatic speech noise detection and/or removal
> utilities?
>
> > There's a lot of work editing voice, so tools that make this type of
> > workflow easier are at a premium. (Ask any foley engineer as an example)
> >
> > I respectfully suggest you take a look at Ardour, Audacity, and Rezound.
>
> I know these of course, not as an advanced user though.
>
> > Ardour, with its excellent user control over regions, gives the
> > operator good editing opportunities, and although there are a couple of
> > keystroke/editing tools missing when compared to pt, it is already solid
> > and capable for fine tune editing.
>
> It looks like you are talking about 100% manual noise removal. The users
> I'm dealing with already do that on protools on Mac. They lose a lot of
> time with this. I'm looking for a way to automate this, at least partly.
> I think they really need this, so, if needed, I suppose I could convince
> them to use Ardour for this specific task. In which case I'd try to code
> something to add this feature to Ardour.
>
> I suppose there are mainly two ways to do this:
>
> 1 - noise detection, manual removal (as mentioned by Dave on this thread).
> This is certainly the most flexible thing, safest, and also the most
> language independent. Because all selected regions could be removed at
> once it could also provide a high level of productivity.
>
> 2 - complete automatic removal. It would be more language specific (just
> think about those tongue sounds in some African languages..), but may
> still work for a certain range of European languages (I'm primarily
> targeting French).
>
> The second solution has the advantage that it could be developed as a
> simple LV2 plugin or so, so that I wouldn't need to go and deal with the
> details of Ardour editing routines, as in the first solution. I can't
> exactly tell how much work I will be able to put into this, I might need
> to restrain my ambitions to stay within some realistic professional
> requirements (read: money).
>
> > Audacity, with its multi track recording, and excellent importing
> > framework, including many tools and effects to manipulate wavs, is a
> > contender too.
> >
> > Rezound, although a bit older in the app stakes, is an excellent editor
> > (in my humble opinion), and has the capability to give the user
> > virtually complete control over recordings, to a degree of excellence
> > that would fit in any studio, full time. I have a colleague in Britain
> > who uses Rezound full time for all his album work and voice over editing
> > he does for others, and he's constantly in admiration of what he
> > achieves with this modest but powerful app, particularly when it's put
> > up against some high priced commercial competitors.
>
> I like Rezound too, but the above remarks still apply.
>
> > 3 contenders as initial suggestions. No doubt there will be other
> > offerings from those who do this more frequently than i do.
>
> Thanks. But don't forget this is a LAD question, and that I'm not afraid
> to code (that's my job actually).
>
>   Olivier



Olivier, i wasn't aware you have coding skills, so freely ignore anything
i've written that isn't relevant.

If you're intent on automating speech analysis and voice noise removal in
some form, then you might do well to start with a 'pre and post' framework.
Things like lipsmacking, and glottal and nasal noise at the end of a phrase,
are fairly easy to identify, and generally occur just before or just after
a phrase. So handling those alone may well cover a decent percentage of any
cleanup, and quickly. (Dependent of course on language. Cleaning up Russian
would be a different 'module' to cleaning up French, or Finnish.)
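
To make the 'pre and post' idea a little more concrete, here is a rough
sketch in Python. Everything in it is an assumption for illustration, not
an existing tool: the function names, the use of plain frame energy as the
detector, and the parameters (threshold, min_phrase, max_gap) are all
hypothetical. It treats long runs of loud frames as phrases, and flags
short loud runs sitting just before or just after a phrase as candidates
for removal. Real mouth noise would likely need a better detector than
energy alone.

```python
def frame_energy(samples, frame_len):
    """Mean squared amplitude for each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def active_runs(energy, threshold):
    """Return (start, end) frame ranges, end exclusive, where energy > threshold."""
    runs, start = [], None
    for i, e in enumerate(energy):
        if e > threshold and start is None:
            start = i
        elif e <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(energy)))
    return runs


def pre_post_candidates(energy, threshold, min_phrase, max_gap):
    """Flag short active runs within max_gap frames of a phrase.

    Assumption: a run of at least min_phrase frames is a phrase; anything
    shorter that sits close to a phrase is a candidate noise region.
    """
    runs = active_runs(energy, threshold)
    phrases = [r for r in runs if r[1] - r[0] >= min_phrase]
    candidates = []
    for start, end in runs:
        if end - start >= min_phrase:
            continue  # this run is itself a phrase
        for p_start, p_end in phrases:
            if 0 <= p_start - end <= max_gap:
                candidates.append((start, end, "pre"))
                break
            if 0 <= start - p_end <= max_gap:
                candidates.append((start, end, "post"))
                break
    return candidates
```

The result is only a list of frame ranges tagged "pre" or "post", so it
could feed either of the two routes discussed above: present the ranges to
the operator for manual removal, or silence them automatically.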

And maybe there's a coding clue there too. A modular, language centric
approach, based on loading a module designed specifically for a particular
language and phrasal interpretation. (Module 1 spoken=French, Module 2
sung=French, Module 3 spoken=Finnish, etc....)
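
As a rough illustration of the modular idea, the sketch below keeps one
settings record per (language, mode) pair and looks it up by name. The
class, the registry, the field names and every number in it are
placeholders i made up for the example, not tuned values and not part of
any existing application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CleanupModule:
    """Hypothetical per-language settings for a noise cleanup pass."""
    language: str
    mode: str          # "spoken" or "sung"
    threshold: float   # energy level above which a frame counts as sound
    min_phrase: int    # shortest run of frames treated as a phrase
    max_gap: int       # how far from a phrase a noise may sit, in frames


REGISTRY = {}


def register(module):
    """Store a module under its (language, mode) key."""
    REGISTRY[(module.language, module.mode)] = module
    return module


def load_module(language, mode):
    """Look up a module, ignoring case; raise LookupError if missing."""
    key = (language.lower(), mode.lower())
    if key not in REGISTRY:
        raise LookupError(f"no cleanup module for {mode} {language}")
    return REGISTRY[key]


# Placeholder values only, to show the shape of the registry.
register(CleanupModule("french", "spoken", 0.02, 10, 15))
register(CleanupModule("french", "sung", 0.05, 25, 10))
register(CleanupModule("finnish", "spoken", 0.02, 12, 15))
```

With that in place, load_module("French", "spoken") would return the first
record, and adding Russian becomes a matter of registering one more module
rather than touching the detection code.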




Alex.



-- 
Parchment Studios (It started as a joke...)