Part 2 of (maybe) 3
For those unfamiliar with Fourier analysis and synthesis, here is a
concrete example that demonstrates potentially serious problems with
sinc resamplers when doing bulk conversions at constant rates. These
problems are real and could easily result in audible artifacts,
especially with further processing, and I assume that matters to
Linux audio users.
------------------------
File 123163main_cas-skr1-112203.wav is the NASA file recently
mentioned on LAU: a public-domain, taxpayer-supported WAV file
sampled at 5000 samples per second. The file was chosen arbitrarily.
I just happened to have resampled it before adding reverb and posting
it for interested LAU'ers a while ago, so I decided to use it for a
comparison for Steve Harris.
Two resamplers:
1) sinc resampler:
$ sndfile-resample -to 44100 -c 0 123163main_cas-skr1-112203.wav \
saturn_sndfile-resample.wav
2) FFT with large windows:
Sampster in Mixster (stuff I wrote myself)
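
For readers who have not seen the FFT approach, here is a minimal
sketch of the general idea: transform the whole block, zero-pad the
spectrum, and transform back at the new length. This is NOT the
Sampster code. The function name fft_resample and the random test
signal are made up for illustration, and it treats the block as one
period of a periodic signal (no windowing or overlap).

```python
import numpy as np

def fft_resample(x, num_out):
    """Upsample real signal x to num_out samples by zero-padding its spectrum.

    Hypothetical helper for illustration only; assumes x is one period of a
    periodic signal and that num_out >= len(x).
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    if num_out < n:
        raise ValueError("this sketch only upsamples")
    spectrum = np.fft.rfft(x)
    if n % 2 == 0 and num_out > n:
        # For even n the last bin is the Nyquist bin, which stands for both
        # +fs/2 and -fs/2. Once it becomes an interior bin of the longer
        # spectrum it would be counted twice, so halve it.
        spectrum[-1] *= 0.5
    padded = np.zeros(num_out // 2 + 1, dtype=complex)
    padded[:len(spectrum)] = spectrum
    # irfft divides by num_out; rescale so amplitudes match the input.
    return np.fft.irfft(padded, n=num_out) * (num_out / n)

# 0.1 s of made-up data at 5000 samples/s, converted to 44100 samples/s.
rng = np.random.default_rng(0)
x = rng.standard_normal(500)
y = fft_resample(x, 4410)

# Every 50th input sample and every 441st output sample share an instant.
print("max error at matching points:", np.max(np.abs(y[::441] - x[::50])))
```

Under these assumptions the output is the band-limited (trigonometric)
interpolation of the input, so it passes through the original samples
at the shared instants up to floating-point rounding.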
The comparison is between every 50th sample in the original file and
every 441st sample in each of the two resampled files. At 5000 and
44100 samples per second respectively, both fall at the same instants,
every 0.01 seconds. Only the first 9.5 seconds of the files were
compared; 9.5 seconds was chosen rather arbitrarily, and there is
nothing special about it. Ideally these particular samples should
match exactly. Any error indicates corruption of the original data at
the exact locations where the original samples were taken. The last
two columns show the difference between what is expected at these
matching points and what was actually obtained after resampling. Note
that the values in the last column are significantly greater in
magnitude than those in the next-to-last column.
Match#  Original       FFT   sndfile       #  FFT sndfile  (diffs)
    1:   0.00000  -2.00000   19.0000      1:   -2      19
    2:   386.000   384.000   437.000      2:   -2      51
    3:  -181.000  -183.000  -178.000      3:   -2       3
    4:  -500.000  -502.000  -538.000      4:   -2     -38
    5:  -1065.00  -1067.00  -1068.00      5:   -2      -3
    6:  -54.0000  -56.0000  -28.0000      6:   -2      26
    7:  -120.000  -122.000  -55.0000      7:   -2      65
    8:  -348.000  -350.000  -344.000      8:   -2       4
    9:   827.000   825.000   805.000      9:   -2     -22
<snip>
  344:  -67.0000  -71.0000   100.000    344:   -4     167
  345:  -378.000  -382.000  -275.000    345:   -4     103
  346:  -37.0000  -41.0000  -101.000    346:   -4     -64
  347:  -209.000  -213.000  -19.0000    347:   -4     190
  348:   269.000   265.000   86.0000    348:   -4    -183
  349:   62.0000   58.0000   27.0000    349:   -4     -35
  350:   427.000   423.000   446.000    350:   -4      19
  351:   154.000   150.000  -47.0000    351:   -4    -201
  352:   619.000   615.000   52.0000    352:   -4    -567
  353:  -202.000  -206.000   111.000    353:   -4     313
  354:  -366.000  -370.000   205.000    354:   -4     571 <<<
        OUCH! Hope this doesn't get expanded. Over 100x larger error.
  355:  -146.000  -150.000   8.00000    355:   -4     154
  356:   549.000   545.000   558.000    356:   -4       9
  357:   279.000   275.000  -34.0000    357:   -4    -313
  358:  -110.000  -114.000  -12.0000    358:   -4      98
  359:  -184.000  -188.000   199.000    359:   -4     383
  360:  -215.000  -219.000  -417.000    360:   -4    -202
  361:   244.000   240.000   74.0000    361:   -4    -170
  362:  -474.000  -478.000  -152.000    362:   -4     322
  363:   188.000   184.000   562.000    363:   -4     374
<snip>
  938:  -1448.00  -1449.00  -1468.00    938:   -1     -20
  939:  -1203.00  -1204.00  -1161.00    939:   -1      42
  940:   3210.00   3209.00   3111.00    940:   -1     -99 <<<
        About 100x larger error at 10% full scale.
  941:   5767.00   5766.00   5838.00    941:   -1      71
  942:  -656.000  -657.000  -628.000    942:   -1      28
  943:  -5165.00  -5166.00  -5163.00    943:   -1       2
  944:   1547.00   1546.00   1584.00    944:   -1      37
  945:   4410.00   4409.00   4445.00    945:   -1      35
  946:   1912.00   1911.00   1881.00    946:   -1     -31
  947:   5947.00   5946.00   5829.00    947:   -1    -118 <<<
        Over 100x larger error at 18% full scale.
  948:   5923.00   5922.00   5902.00    948:   -1     -21
  949:   3462.00   3461.00   3494.00    949:   -1      32
What this shows is that every 0.01 seconds, where the original file
and the resampled file should have exactly the same value (if the
original data were preserved), sndfile-resample produces large errors.
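
If you want to repeat this kind of check on your own files, the
comparison amounts to something like the sketch below. The function
name matching_point_errors is made up for illustration, and I assume
the audio has already been loaded into NumPy arrays by whatever WAV
reader you prefer. The demo data are just the first nine matching
values for the original and for sndfile-resample from the table above,
placed at the positions where they would sit in the real files.

```python
import numpy as np

def matching_point_errors(original, resampled,
                          rate_in=5000, rate_out=44100, period=0.01):
    """Return resampled-minus-original at the instants both files share.

    Hypothetical helper: with the default rates, one sample in each file
    falls on every multiple of `period` seconds (every 50th input sample
    and every 441st output sample).
    """
    step_in = round(rate_in * period)     # 50 for 5000 samples/s
    step_out = round(rate_out * period)   # 441 for 44100 samples/s
    a = np.asarray(original, dtype=float)[::step_in]
    b = np.asarray(resampled, dtype=float)[::step_out]
    n = min(len(a), len(b))               # compare only the common span
    return b[:n] - a[:n]

# First nine matching values from the table (original and sndfile columns).
orig_points = [0, 386, -181, -500, -1065, -54, -120, -348, 827]
sndfile_points = [19, 437, -178, -538, -1068, -28, -55, -344, 805]

# Place them where they would sit in the real files; the in-between
# samples are irrelevant to this check, so they are left at zero.
original = np.zeros(9 * 50)
original[::50] = orig_points
resampled = np.zeros(9 * 441)
resampled[::441] = sndfile_points

errors = matching_point_errors(original, resampled)
print(errors.astype(int).tolist())
# [19, 51, 3, -38, -3, 26, 65, 4, -22]
print("largest error:", int(np.max(np.abs(errors))))
```

The printed differences are the same as the last column of the table
for matches 1 through 9.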
------------------------
resample-1.7 was even worse, with a phase shift on top of this type of
inaccuracy, coupled with rather serious spectral leakage beyond 2.5
kHz, which was the original band limit (or might as well be assumed to
have been, since it is half the 5000 Hz sampling rate). Upon examining
the waveforms, I could see that resample-1.7 was doing an excellent job
of tracing out the original waveform by drawing pretty much straight
lines between points. Although visually reassuring, this actually adds
spectral components that were not in the original. So it depends on
what you want. This resampling probably won't sound like the original,
but it does look good in an editor.
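
To see why straight lines between points add spectral components,
here is a small sketch. It does NOT reproduce what resample-1.7 does
internally; it only joins the samples of a made-up 1 kHz test tone
with straight lines (np.interp) and measures how much of the result's
power lies above 2.5 kHz. The helper name fraction_above and the test
tone are my own choices for illustration.

```python
import numpy as np

RATE_IN, RATE_OUT = 5000, 44100

# One second of a 1 kHz tone sampled at 5000 samples/s (made-up test data).
t_in = np.arange(RATE_IN) / RATE_IN
t_out = np.arange(RATE_OUT) / RATE_OUT
tone = np.sin(2 * np.pi * 1000 * t_in)

# Straight lines between the original points; period=1.0 makes the
# interpolation wrap around so the one-second block stays periodic.
linear = np.interp(t_out, t_in, tone, period=1.0)

# The same tone computed directly at the output rate, for reference.
reference = np.sin(2 * np.pi * 1000 * t_out)

def fraction_above(signal, rate, cutoff_hz):
    """Fraction of total spectral power above cutoff_hz (hypothetical helper)."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / rate)
    return power[freqs > cutoff_hz].sum() / power.sum()

print("above 2.5 kHz, straight lines:", fraction_above(linear, RATE_OUT, 2500))
print("above 2.5 kHz, reference tone:", fraction_above(reference, RATE_OUT, 2500))
```

The straight-line version carries measurable power above the original
2.5 kHz band limit, while the reference tone has essentially none. The
waveform looks fine in an editor, but the kinks at the original sample
points are new high-frequency content.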
------------------------
Also of interest: the very latest version of sndfile-resample gives
slightly different results from an earlier version at the locations
that should match (the version numbers are those of libsamplerate):
Match#    v 0.0.15     v 0.1.2
    1:     19.0000     18.0000
    2:    437.0000    436.0000
    3:   -178.0000   -179.0000
    4:   -538.0000   -539.0000
    5:  -1068.0000  -1068.0000
    6:    -28.0000    -28.0000
    7:    -55.0000    -55.0000
    8:   -344.0000   -344.0000
    9:    805.0000    805.0000
   10:    -81.0000    -82.0000
   11:    482.0000    482.0000
   12:     78.0000     77.0000
   13:    227.0000    227.0000
   14:    501.0000    500.0000
   15:     13.0000     12.0000
<snip>
So the *amount* of corruption of the original data at locations that
should match varies with version! Fortunately (or perhaps
unfortunately, depending upon your point of view), this latest version
never differs from the earlier version by more than 2, so the "latest
and greatest" is just as bad. The errors in the first table would
change by 2 or less, which is insignificant.