On Mon, Jun 25, 2018 at 4:31 PM, Tim <termtech@rogers.com> wrote:
> I read they use more than just spectral stuff.
> Like AI used in speech recognition and so on.
>
> Amazing what DSP audio and image coding can do these days.
> Any thoughts on coding techniques? I've read a lot of papers!
> Some say using FFTs + auto-correlation comparisons.
> Some say non-negative matrix factorization.
> My head spins, but this team definitely deserves praise.
> Can open source come up with something?


Had I sufficient time, I'd like to investigate using hidden Markov models and turbo coding, the way some speech recognition algorithms do. This could give you estimates of the key you are playing in, and you could use that data to calculate the likelihood of various pitches. I suspect you could get some interesting results.
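
To make the key-estimation idea a bit more concrete, here is a rough sketch of a hidden Markov model whose hidden states are the 12 major keys and whose observations are pitch classes (0 = C, 1 = C#, and so on), decoded with the Viterbi algorithm. Everything in it is an assumption for illustration: the function names, the major-only key set, and the probabilities 0.9 and 0.95 are made up, and it does not attempt the turbo coding part at all.

```python
import math

MAJOR_SCALE = {0, 2, 4, 5, 7, 9, 11}  # semitone offsets of a major scale
N_KEYS = 12  # hidden states: the 12 major keys, indexed by tonic pitch class


def emission_logp(key, pitch_class, p_in_scale=0.9):
    """Log-probability of hearing pitch_class while in the given key.

    p_in_scale is an illustrative guess, not a measured value.
    """
    in_scale = (pitch_class - key) % 12 in MAJOR_SCALE
    # Spread p_in_scale over the 7 scale tones and the rest over the other 5.
    p = p_in_scale / 7 if in_scale else (1 - p_in_scale) / 5
    return math.log(p)


def transition_logp(prev_key, key, p_stay=0.95):
    """Log-probability of moving between keys; staying put is favored."""
    p = p_stay if prev_key == key else (1 - p_stay) / (N_KEYS - 1)
    return math.log(p)


def viterbi_keys(pitch_classes):
    """Return the most likely key (as a tonic pitch class) for each note."""
    if not pitch_classes:
        return []
    # Start from a uniform prior over keys.
    score = [math.log(1 / N_KEYS) + emission_logp(k, pitch_classes[0])
             for k in range(N_KEYS)]
    back = []  # back[t][k] = best predecessor of key k at step t + 1
    for pc in pitch_classes[1:]:
        new_score, pointers = [], []
        for k in range(N_KEYS):
            best_prev = max(range(N_KEYS),
                            key=lambda j: score[j] + transition_logp(j, k))
            new_score.append(score[best_prev]
                             + transition_logp(best_prev, k)
                             + emission_logp(k, pc))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    k = max(range(N_KEYS), key=lambda j: score[j])
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return path


# A G major scale: G A B C D E F# G
print(viterbi_keys([7, 9, 11, 0, 2, 4, 6, 7]))
```

The emission table doubles as the "likelihood of various pitches" given a key: once the key estimate settles, emission_logp(key, pitch_class) tells you how plausible each candidate pitch is, which could be used to re-weight whatever the spectral analysis proposes.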

I don't know of anyone really working on polyphonic pitch recognition in the open source world. I think Bayesian filtering of some kind would be compelling, though. Perhaps some of the work from ISSE (http://isse.sourceforge.net/) could be adapted and made to run in realtime.
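
As one possible reading of "Bayesian filtering of some kind", here is a tiny discrete Bayes filter that keeps a running belief over a handful of pitch candidates and updates it one frame at a time, which is the property you would want for realtime use. It is only a sketch under my own assumptions: the function name, the p_stay value, and the per-frame likelihoods are hypothetical, it tracks a single pitch rather than a polyphonic mix, and it has nothing to do with how ISSE actually works.

```python
def bayes_filter_step(belief, likelihood, p_stay=0.9):
    """One predict/update step of a discrete Bayes filter.

    belief:     current probability of each pitch candidate (sums to 1)
    likelihood: how well the current audio frame fits each candidate,
                e.g. from a spectral match (made-up numbers below)
    p_stay:     assumed probability that the pitch has not changed
    Needs at least two candidates and a nonzero total likelihood.
    """
    n = len(belief)
    # Predict: the pitch stays put with p_stay, otherwise it jumps
    # uniformly to any of the other candidates.
    p_move = (1 - p_stay) / (n - 1)
    total = sum(belief)
    predicted = [p_stay * b + p_move * (total - b) for b in belief]
    # Update: weight by the evidence from this frame, then normalize.
    posterior = [p * l for p, l in zip(predicted, likelihood)]
    norm = sum(posterior)
    return [p / norm for p in posterior]


# Three candidate pitches, starting from a uniform belief.
belief = [1 / 3, 1 / 3, 1 / 3]
for frame_likelihood in [[0.2, 0.7, 0.1]] * 5:  # illustrative values
    belief = bayes_filter_step(belief, frame_likelihood)
print(belief)
```

Because each step only needs the previous belief and the current frame, the cost per frame is fixed, unlike a decoder that has to look at the whole recording before answering.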

_Spencer