Hello all,
I'm working on an improved version of zita-at1, which most of
you probably know as the x42-autotune plugin. The update,
zita-at2, will preserve formants while retuning.
To test and develop this I need some clean vocal tracks, in
particular of female singers and also of very low (bass) male
voices. So if you have these available, I'd be very happy if
you could share them. They will of course not be used for any
purpose other than improving the retuner algorithm.
TIA for anything you can provide!
--
FA
Ratatouille is a Neural Model loader and mixer for Linux/Windows.
![Ratatouille](https://github.com/brummer10/Ratatouille.lv2/blob/main/Ratatouille.png?raw=true)
Ratatouille allows loading up to two neural model files and
mixing their output. The models can be [*.nam files](https://tonehunt.org/all) or
[*.json or .aidax files](https://cloud.aida-x.cc/all). So you can
blend from clean to crunch, for example, or go wild and mix
different amp models, or mix an amp with a pedal simulation.
To round out the sound, it also allows loading up to two
Impulse Response files and mixing their output. You can try
the wildest combinations, or be conservative and load just
your single preferred IR file.
Each neural model may expect a different sample rate;
Ratatouille will resample the buffer to match it.
Ready-to-use binaries are available on the Release Page.
Please note that Ratatouille.lv2-v3-v0.2-linux-x86_64.tar.xz
is a version fully optimised for the x86-64-v3
micro-architecture level. You can check whether your system
supports that by running
`/usr/lib64/ld-linux-x86-64.so.2 --help 2>/dev/null | grep 'x86-64-v3 (supported'`
If that returns nothing, your system can't use this version;
in that case choose Ratatouille.lv2-v0.2-linux-x86_64.tar.xz.
Impulse Response Files will be resampled on the fly to match the session
Sample Rate.
To build from source, please use Ratatouille.lv2-v0.2-src.tar.xz,
as only that archive contains the needed submodules.
Release Page:
https://github.com/brummer10/Ratatouille.lv2/releases/tag/v0.2
Project Page:
https://github.com/brummer10/Ratatouille.lv2
Hello all,
Several people have asked how the pitch estimation
in zita-at1 works.
The basic method is to look at the autocorrelation
of the signal. This is a measure of how similar a
signal is to a time-shifted version of itself. It
can be computed efficiently as the inverse FFT of
the power spectrum.
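As a self-contained illustration (this is not the zita-at1
code), the equivalence between direct autocorrelation and the
inverse transform of the power spectrum can be checked in a
few lines of plain Python, using a naive O(n^2) DFT to avoid
external dependencies:

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform, O(n^2) -- fine for a demo."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def idft(X):
    """Inverse DFT, returning the real part."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]

# Short periodic test signal with a period of 4 samples.
x = [0.0, 1.0, 0.0, -1.0] * 4
n = len(x)

# Direct circular autocorrelation: similarity of x to a copy of
# itself shifted by 'lag' samples.
ac_direct = [sum(x[t] * x[(t + lag) % n] for t in range(n))
             for lag in range(n)]

# The same result via the power spectrum (Wiener-Khinchin).
ac_fft = idft([abs(v) ** 2 for v in dft(x)])
```

A real implementation would use an FFT and zero-pad the frame
to avoid the circular wrap-around; here the strongest
non-zero-lag peak appears at lag 4, matching the 4-sample
period of the test signal.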
In many cases the strongest autocorrelation peak
corresponds to the fundamental period. But this can
easily get ambiguous as there will also be peaks at
integer multiples of that period, and for strong
harmonics. To avoid errors it is necessary to look
also at the signal spectrum and level, and combine
all that info in some way. How exactly is mostly a
matter of trial and error. Which is why I need more
examples.
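As a toy illustration of that ambiguity (this is not the
actual zita-at1 decision logic, and the threshold value is
arbitrary), one way to suppress the octave-down error is to
prefer the shortest lag whose peak comes close to the global
maximum, rather than blindly taking the maximum itself:

```python
def pick_period(ac, thresh=0.85, min_lag=2):
    """Pick a pitch period from an autocorrelation array.

    Toy heuristic: return the smallest lag that is a local
    maximum and whose value is within 'thresh' of the global
    maximum over the searched range; 0 means unvoiced.
    """
    best = max(ac[min_lag:])
    if best <= 0.0:
        return 0  # no positive peak at all: treat as unvoiced
    for lag in range(min_lag, len(ac) - 1):
        if (ac[lag] >= ac[lag - 1] and ac[lag] >= ac[lag + 1]
                and ac[lag] >= thresh * best):
            return lag
    return 0

# Autocorrelation with peaks at lags 50 and 100; the peak at
# 100 (the octave below) is slightly stronger, as happens with
# strong harmonics.
ac = [0.0] * 120
ac[50] = 0.95
ac[100] = 1.0
```

Here `pick_period(ac)` returns 50 rather than 100. A real
detector, as described above, would also weigh in the spectrum
and the signal level before committing to a period.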
Have a look at
<http://kokkinizita.linuxaudio.org/linuxaudio/pitchdet1.png>
This is a test of the pitch detection algorithm used in
zita-at1.
The X-axis is time in seconds, a new pitch estimate is
made every 10.667 ms (512 samples at 48 kHz).
Vertically we have autocorrelation, the Y-axis is in
samples. Red is positive, blue negative. The green dots
are the detected pitch period, zero means unvoiced.
The blue line on top is signal level in dB.
Note how this singer has a habit of letting the pitch
'droop', by up to an octave, at the end of a note. He
is probably not aware of it. This happens at 28.7s,
again at 30.8s, and in fact during the entire track.
What should an autotuner do with this? Turn the glide
into a chromatic scale? The real solution here would
be to edit the recording, adding a fast fadeout just
before the 'droop'. Even a minimal amount of reverb
will hide this.
The fragment from 29.7 to 30.3s is an example of a
vowel with very strong harmonics which show up as
the red bands below the real pitch period. In this
case the 2nd and 3rd harmonic were actually about 20
dB stronger than the fundamental. This is resolved
because the autocorrelation is still strongest at
the fundamental pitch.
The very last estimate in the next fragment (at 30.85s)
is an example of where this goes wrong: the algorithm
selects twice the real pitch period, assuming the
first autocorrelation peak is the 2nd harmonic.
This happens because there was significant energy
at the subharmonic, actually leakage from another
track via the headphones used by the singer.
The false 'voiced' detection at 30.39s is also the
result of a signal leaking via the headphones.
Ciao,
--
FA
Hello all,
Zita-at1-0.8.1 is now available at the usual place:
<http://kokkinizita.linuxaudio.org/linuxaudio/downloads/index.html>
Note: this is not (yet) the new zita-at2 that will have formant
correction.
Changes:
- Bug fixes.
- Improved pitch estimation algorithm.
- Low latency mode, reduces latency to around 10 ms.
Note to Robin Gareus:
The new Retuner class can probably be used without changes
in the x42-autotuner plugin.
The logic that controls jumping forward/back while resampling
has been changed. Instead of just trying to avoid reading
outside the available input range, it now tries to keep the
read index as close as possible to the ideal position, i.e.
'latency' samples behind the write index. That also means
that for unvoiced input it will be within +/- 1.3 ms of the
ideal position, so nothing special is required for this case.
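A minimal sketch of that idea (my paraphrase of the
description above, not the actual Retuner code; the names and
the whole-period jump rule are assumptions):

```python
def nudge_read_index(read, write, latency, period, voiced):
    """Move the read index toward its ideal position,
    'latency' samples behind the write index.

    For voiced input a jump must be a whole number of pitch
    periods, so the resampled output stays phase-continuous.
    For unvoiced input any jump is inaudible, so we snap
    straight to the ideal position.
    """
    ideal = write - latency
    if not voiced:
        return ideal
    err = read - ideal
    # Jump back or forward by the whole number of periods
    # closest to the current error.
    return read - round(err / period) * period
```

For example, with `write = 1500`, `latency = 480` and a
100-sample period, a voiced read index of 1200 (180 samples
late) is pulled back by two periods to 1000, leaving it within
half a period of the ideal position 1020.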
Ciao,
--
FA
Hi everyone,
I am hoping this is the right place to post this question.
I am trying to implement the resampling algorithm as described in this
paper: https://ccrma.stanford.edu/~jos/resample/resample.pdf
I managed to construct the h (windowed sinc) table, and I have
implemented the loop. I am having a hard time computing the
eta interpolation factor and the l index for the h table. The
"fixed point" implementation in the paper is a bit confusing
to me.
Looking at figure 7 in the paper, I think the interpolation
factor needs to be computed between the h entries rather than
between the original samples. However, I don't understand how
to compute the starting point (h(l)).
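For concreteness, here is how I currently compute l and eta
from the fractional part of the time register (using floats
instead of the paper's fixed-point register; this may well be
where my misunderstanding is):

```python
import math

L = 32      # table entries per zero crossing (table oversampling)
NZ = 8      # zero crossings kept on each side of the sinc

# Build one half of a windowed-sinc table h[0..NZ*L]. A Hann
# window is used here just for the demo, not the optimal
# window the paper derives.
N = NZ * L
h = []
for i in range(N + 1):
    t = i / L                                      # time in input samples
    sinc = 1.0 if i == 0 else math.sin(math.pi * t) / (math.pi * t)
    win = 0.5 * (1.0 + math.cos(math.pi * i / N))  # half Hann window
    h.append(sinc * win)

def h_interp(frac):
    """Evaluate h at a fractional offset 0 <= frac < 1 from an
    input sample, interpolating linearly between adjacent
    table entries."""
    pos = frac * L          # position in table units
    l = int(pos)            # integer table index
    eta = pos - l           # interpolation factor, 0 <= eta < 1
    return h[l] + eta * (h[l + 1] - h[l])
```

My reading of figure 7 is that eta interpolates between
adjacent h entries like this, and that in the fixed-point
scheme pos would simply be the low-order bits of the time
register, but I'm not sure that is right.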
I have just started working with audio code, so I might be
misunderstanding something fundamental about the paper.
I have coded a prototype which is shared here:
https://gist.github.com/theWatchmen/746f35c349748525b412cfd9466608ce
Apologies in advance if this is not the right forum.
Marco
Ratatouille is a Neural Model loader and mixer for Linux/Windows.
![Ratatouille](https://github.com/brummer10/Ratatouille.lv2/blob/main/Ratatouille.png?raw=true)
Ratatouille allows loading up to two neural model files and
mixing their output. The models can be [*.nam files](https://tonehunt.org/all) or
[*.json or .aidax files](https://cloud.aida-x.cc/all). So you can
blend from clean to crunch, for example, or go wild and mix
different amp models, or mix an amp with a pedal simulation.
To round out the sound, it also allows loading up to two
Impulse Response files and mixing their output. You can try
the wildest combinations, or be conservative and load just
your single preferred IR file.
Ready-to-use binaries are available on the Release Page.
Please note that Ratatouille.lv2-v3-v0.1-linux-x86_64.tar.xz
is a version fully optimised for the x86-64-v3
micro-architecture level. You can check whether your system
supports that by running
`/usr/lib64/ld-linux-x86-64.so.2 --help 2>/dev/null | grep 'x86-64-v3 (supported'`
If that returns nothing, your system can't use this version;
in that case choose Ratatouille.lv2-v0.1-linux-x86_64.tar.xz.
To build from source, please use Ratatouille.lv2-v0.1-src.tar.xz,
as only that archive contains the needed submodules.
Release Page:
https://github.com/brummer10/Ratatouille.lv2/releases/tag/v0.1
Project Page:
https://github.com/brummer10/Ratatouille.lv2