On Thu, Jun 12, 2014, at 09:41 AM, Fons Adriaensen wrote:
> There's a paper by Dolson and Laroche (1999) which is a 'must read'
> for anyone dabbling with phase vocoders.

One warning: the method described in it is patented -- I had to do a
hasty rewrite in some early Rubber Band code. I don't know whether
anyone enforces the patent, though.

> For code, maybe have a look at Rubberband which may contain
> interesting things (I don't know, never dared to look).

Probably wise -- I expect it would horrify you. And of course this
application is an incredible time-sink simply because there's no right
way to do it. It's a subject that can surely drive you mad.
You gave a low-level example of the problem earlier (with neighbouring
frequency bins). Looking at it at a high level, you're basically trying
to synthesise a signal that corresponds to "what the same instruments
would have sounded like if they were playing slower" (or higher, or
whatever). You don't have anything like enough knowledge frame-by-frame
to actually do that. You can get closer for many signals with a
sinusoidal modelling decomposition (in which you track the frequencies
that appear to be consistent frame-to-frame, adjust their phases, and
treat the rest as noise whose phase you don't change) but those methods
are still fairly expensive to do and of course even the best method can
never actually be technically correct.
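
To make the "track the frequencies that appear to be consistent frame-to-frame" idea a little more concrete, here is a minimal Python/NumPy sketch of the peak-picking and peak-matching step only. It is an illustration under assumptions, not code from any particular library: the function names, the `floor` and `max_jump` parameters, and the greedy nearest-peak matching are all invented for the example. Bins that end up in no track would be the "rest" treated as noise.

```python
import numpy as np

def find_peaks(mag, floor=1e-3):
    """Return indices of local maxima in a magnitude spectrum above `floor`."""
    mag = np.asarray(mag, dtype=float)
    inner = mag[1:-1]
    is_peak = (inner > mag[:-2]) & (inner >= mag[2:]) & (inner > floor)
    return [int(i) for i in np.flatnonzero(is_peak) + 1]

def continue_tracks(prev_peaks, cur_peaks, max_jump=2):
    """Greedily pair each previous-frame peak with the nearest unused
    current-frame peak that lies within `max_jump` bins of it."""
    matches = []
    used = set()
    for p in prev_peaks:
        candidates = [c for c in cur_peaks
                      if c not in used and abs(c - p) <= max_jump]
        if candidates:
            best = min(candidates, key=lambda c: abs(c - p))
            used.add(best)
            matches.append((p, best))
    return matches

# Two toy frames: one partial drifts by a bin, the other jumps too far.
frame_a = [0, 1, 0, 0, 0, 2, 0, 0, 0, 0]
frame_b = [0, 0, 1, 0, 0, 0, 0, 0, 3, 0]
tracks = continue_tracks(find_peaks(frame_a), find_peaks(frame_b))
```

A real sinusoidal model would also interpolate peak frequencies between bins and handle tracks being born and dying; none of that is attempted here.
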
Rubber Band does basically the same sums as you just gave. By default it
fudges the time/frequency question for neighbouring bins by taking
groups of bins that appear to be moving in the same direction (in
frequency) and giving each one a phase advance somewhere between its
single-bin predicted value and the value that would be expected if the
group were all following the same path. That's mathematically... barely
supportable at all, but in many cases it sounds OK.
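
As a rough illustration of those sums (and emphatically not Rubber Band's actual code), here is a minimal Python/NumPy sketch of the usual per-bin phase advance, followed by a blend that pulls each bin in a group part-way toward a shared value. The function names, the `weight` parameter, passing groups in as (start, stop) bin ranges, and using the group's strongest bin to stand for "the same path" are all assumptions made for the example.

```python
import numpy as np

def princarg(phase):
    """Wrap phase values into the range [-pi, pi)."""
    return (phase + np.pi) % (2.0 * np.pi) - np.pi

def single_bin_advance(prev_phase, cur_phase, hop_in, hop_out, n_fft):
    """Per-bin phase advance for one output hop, each bin treated alone."""
    k = np.arange(len(cur_phase))
    omega = 2.0 * np.pi * k / n_fft            # bin centre frequency, rad/sample
    # How far the measured phase change departs from the bin-centre prediction
    deviation = princarg(cur_phase - prev_phase - omega * hop_in)
    true_freq = omega + deviation / hop_in     # estimated frequency, rad/sample
    return true_freq * hop_out

def blended_advance(advance, magnitude, groups, weight=0.5):
    """Move each grouped bin's advance toward its group's shared advance.

    weight=0 keeps the single-bin values; weight=1 locks the whole group
    to the advance of its strongest bin (an assumption of this sketch).
    """
    out = np.array(advance, dtype=float)
    for start, stop in groups:
        peak = start + int(np.argmax(magnitude[start:stop]))
        shared = out[peak]
        out[start:stop] = (1.0 - weight) * out[start:stop] + weight * shared
    return out
```

How the groups are found in the first place (bins "moving in the same direction") is left out; the sketch only shows the compromise between the two candidate phase advances.
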
The other thing it does by default is reset the stretch factor and
revert to the input phases when a sufficiently noisy transient is found,
which is why you get quite poppy transients in e.g. drum loops that are
either satisfying or unrealistic depending on your point of view.
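
A minimal Python/NumPy sketch of that idea, under stated assumptions: the detector below (fraction of bins whose level jumps by more than a few dB), its `rise_db` and `fraction` thresholds, and the function names are hypothetical and not taken from Rubber Band. "Reset the stretch factor" is interpreted here simply as using the input hop for that one frame.

```python
import numpy as np

def is_transient(prev_mag, cur_mag, rise_db=3.0, fraction=0.35):
    """Crude broadband-onset test: did enough bins jump in level?"""
    eps = 1e-10  # avoids log of zero for silent bins
    prev_mag = np.asarray(prev_mag, dtype=float)
    cur_mag = np.asarray(cur_mag, dtype=float)
    rise = 20.0 * np.log10((cur_mag + eps) / (prev_mag + eps))
    return bool(np.mean(rise > rise_db) > fraction)

def frame_hop(hop_in, ratio, transient):
    """Output hop for this frame: unstretched if the frame is a transient."""
    return hop_in if transient else int(round(hop_in * ratio))

def output_phase(prev_out_phase, in_phase, advance, transient):
    """Synthesis phases for this frame.

    On a transient, revert to the analysis (input) phases; otherwise
    continue from the previous output phases by the computed advance.
    """
    if transient:
        return np.array(in_phase, dtype=float)
    return np.asarray(prev_out_phase, dtype=float) + advance
```

Reverting to the input phases keeps the attack's waveform shape intact, which is where the "poppy" quality comes from; the time lost by not stretching that frame has to be made up on the frames around it, which this sketch does not show.
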
Chris