On Mon, Mar 9, 2009 at 2:33 PM, Olivier Guilyardi <list(a)samalyse.com> wrote:
alex stone wrote:
On Mon, Mar 9, 2009 at 1:10 PM, Olivier Guilyardi <list(a)samalyse.com> wrote:
I'm looking for material, docs and/or software to remove speech noise, as
caused by the movements of the mouth.
In the commercial world this is one of the strengths of protools, and I
know 2 voice editors (full time) who use pt for precisely this task.
(Others include Sequoia, Nuendo, etc...)
Does protools include some automatic speech noise detection and/or removal
utilities?
There's a lot of work editing voice, so tools that make this type of
workflow easier are at a premium. (Ask any foley engineer as an example)
I respectfully suggest you take a look at Ardour, Audacity, and Rezound.
I know these of course, not as an advanced user though.
Ardour, with its excellent user control over regions, gives the operator
good editing opportunities, and although there are a couple of
keystroke/editing tools missing when compared to pt, it is already solid
and capable for fine-tune editing.
It looks like you are talking about 100% manual noise removal. The users
I'm dealing with already do that on protools on Mac. They lose a lot of
time with this. I'm looking for a way to automate this, at least partly. I
think they really need this, so, if needed, I suppose I could convince them
to use Ardour for this specific task. In which case I'd try to code
something to add this feature to Ardour.
I suppose there are mainly two ways to do this:
1 - noise detection, manual removal (as mentioned by Dave on this thread).
This is certainly the most flexible and safest approach, and also the most
language independent. Because all selected regions could be removed at once,
it could also provide a high level of productivity.
2 - complete automatic removal. It would be more language specific (just
think about those tongue sounds in some African languages..), but may still
work for a certain range of European languages (I'm primarily targeting
French).
The second solution has the advantage that it could be developed as a
simple LV2 plugin or so, so that I wouldn't need to go and deal with the
details of Ardour editing routines, as in the first solution. I can't
exactly tell how much work I will be able to put into this, I might need to
restrain my ambitions to stay within some realistic professional
requirements (read: money).
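As a rough illustration of the two approaches above, here is a minimal
Python sketch. It is not taken from Ardour, protools or any existing plugin.
The function names, the frame size and the thresholds are all assumptions
chosen for the example, and a plain level-based test like this would
certainly need tuning (or a better detector) on real voice recordings.
find_candidates() corresponds to approach 1 (flag regions and let the
operator decide), mute_regions() to approach 2 (remove without review).

```python
import math

def frame_rms(samples, frame_len):
    """RMS level of each non-overlapping frame of `frame_len` samples."""
    return [
        math.sqrt(sum(s * s for s in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def find_candidates(samples, frame_len=64, floor=0.01, speech=0.2, max_frames=4):
    """Approach 1: return (start, end) sample ranges worth reviewing.

    A candidate is a short run of "soft" frames (louder than `floor`,
    quieter than `speech`) with quiet frames on both sides.
    All thresholds are illustrative guesses, not measured values.
    """
    levels = frame_rms(samples, frame_len)
    regions = []
    run_start = None   # first frame of the current soft run, if any
    prev_quiet = True  # the start of the signal counts as quiet
    for idx, level in enumerate(levels + [0.0]):  # quiet sentinel at the end
        quiet = level <= floor
        soft = floor < level < speech
        if soft:
            # Only open a run when it begins right after a quiet frame.
            if run_start is None and prev_quiet:
                run_start = idx
        else:
            # Keep the run only if it is short and ends in quiet, not speech.
            if run_start is not None and quiet and idx - run_start <= max_frames:
                regions.append((run_start * frame_len, idx * frame_len))
            run_start = None
        prev_quiet = quiet
    return regions

def mute_regions(samples, regions):
    """Approach 2: silence every flagged region without manual review."""
    out = list(samples)
    for start, end in regions:
        out[start:end] = [0.0] * (end - start)
    return out
```

In an editor, the list returned by find_candidates() would become
selectable regions. In a plugin, the same list would be passed straight to
mute_regions(), or to a gentler fade.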
Audacity, with its multi-track recording and excellent importing
framework, including many tools and effects to manipulate wavs, is a
contender too.
Rezound, although a bit older in the app stakes, is an excellent editor
(in my humble opinion), and has the capability to give the user
virtually complete control over recordings, to a degree of excellence
that would fit in any studio, full time. I have a colleague in Britain
who uses Rezound full time for all his album work and voice over editing
he does for others, and he's constantly in admiration of what he
achieves with this modest but powerful app, particularly when it's put
up against some high priced commercial competitors.
I like Rezound too, but the above remarks still apply.
3 contenders as initial suggestions. No doubt there will be other
offerings from those who do this more frequently than I do.
Thanks. But don't forget this is a LAD question, and that I'm not afraid to
code (that's my job actually).
Olivier
Olivier, I wasn't aware you have coding skills, so freely ignore anything
I've written that isn't relevant.
If you're intent on automating a speech analysis, voice noise removal device
of some sort, then you might do well to start with a 'pre and post'
framework. Things like lipsmacking, glottal and nasal noise at the end of a
phrase, etc, are fairly easy to identify, and generally occur pre and post,
that is, just before a phrase starts or just after it ends. So that may
well be a decent percentage of any cleanup done quickly. (Depending, of
course, on the language. Cleaning up Russian would be a different 'module'
to cleaning up French, or Finnish.)
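To make the 'pre and post' idea concrete, here is a minimal Python sketch.
It assumes that phrase boundaries and candidate noise regions have already
been found by some other means, both as (start, end) sample ranges. The
function name near_phrase_edges() and the margin parameter are hypothetical,
introduced only for this example.

```python
def near_phrase_edges(candidates, phrases, margin):
    """Sort candidate noises into 'pre' and 'post' lists.

    candidates, phrases: lists of (start, end) sample ranges.
    margin: how many samples before a phrase start, or after a phrase
            end, still count as "next to" that phrase (an assumed knob).
    Candidates that sit inside a phrase or far from any phrase are ignored.
    """
    pre, post = [], []
    for c_start, c_end in candidates:
        for p_start, p_end in phrases:
            # Entirely inside the window just before the phrase starts.
            if p_start - margin <= c_start and c_end <= p_start:
                pre.append((c_start, c_end))
                break
            # Entirely inside the window just after the phrase ends.
            if p_end <= c_start and c_end <= p_end + margin:
                post.append((c_start, c_end))
                break
    return pre, post
```

The point of the split is that the 'pre' and 'post' lists could be cleaned
up in one pass, while anything inside a phrase is left for manual editing.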
And maybe there's a coding clue there too. A modular, language-centric
approach, based on loading a module designed specifically for a particular
language and phrasal interpretation. (Module 1 spoken=French, Module 2
sung=French, Module 3 spoken=Finnish, etc....)
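One way such modules could be organised is a small registry keyed by
language and mode, sketched below in Python. Everything here is an
assumption made for illustration: the CleanupProfile fields, the register()
and load_profile() helpers, and above all the numeric values, which are
placeholders and not measured settings for any language.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CleanupProfile:
    """Detection settings for one language/mode pair (hypothetical fields)."""
    language: str
    mode: str           # "spoken" or "sung"
    floor: float        # level below which a frame counts as quiet
    speech: float       # level above which a frame counts as speech
    max_burst_ms: int   # longest burst still treated as mouth noise

PROFILES = {}

def register(profile):
    """Add a module to the registry under its (language, mode) key."""
    PROFILES[(profile.language, profile.mode)] = profile

def load_profile(language, mode):
    """Look up the module for a language/mode pair, or fail clearly."""
    try:
        return PROFILES[(language, mode)]
    except KeyError:
        raise LookupError(f"no cleanup module for {mode} {language}") from None

# Placeholder values only; real ones would have to come from listening tests.
register(CleanupProfile("French", "spoken", 0.01, 0.20, 25))
register(CleanupProfile("French", "sung", 0.02, 0.30, 15))
register(CleanupProfile("Finnish", "spoken", 0.01, 0.20, 30))
```

With this layout, load_profile("French", "spoken") selects Module 1 from
the list above, and supporting another language means registering one more
profile without touching the detection code.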
Alex.
--
Parchment Studios (It started as a joke...)