On 09-20, Fons Adriaensen wrote:
On Sat, Sep 20, 2014 at 05:01:32PM -0400, Mark D. McCurry wrote:
If you are proposing that 256 harmonics are not needed, then is there a
transformation that yields an equivalent psycho-acoustic output in less time
than the FFT would have taken given any possible spectral input?
(The user has full control over the full spectrum in terms of phase/magnitude.)
I'm certainly not claiming that there is some simple trick to simplify
things.
Darn, I was hoping for you to point out some sort of interesting way of working
with psycho-acoustic space.
I have worked with it some, but I surely have large gaps in the details and
possible best ways of working with it.
But the information theory point still stands: if you compute
256K samples and only output 256, 512 or 1024, that means that 99%
of the information you have is thrown away. Which probably
means it was not necessary to compute those 256K in the first place,
at least not to produce the first period of output. The only case where
this is not true is for algorithms that deliberately hide or destroy
information, i.e. cryptographic ones.
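To put numbers on the "99%" figure, here is a small sketch of the arithmetic. It assumes "256K" means 256 * 1024 samples; the function name is made up for illustration.

```python
# Share of computed samples that never reach the output, using the
# figures quoted above (256K computed, 256 to 1024 output per period).
COMPUTED = 256 * 1024  # assumption: "256K" means 262144 samples

def discarded_fraction(computed: int, used: int) -> float:
    """Return the share of computed samples that is not output."""
    return 1.0 - used / computed

for used in (256, 512, 1024):
    print(used, f"{discarded_fraction(COMPUTED, used):.4%}")
```

Under that assumption the discarded share stays above 99.6% for all three output sizes.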
Yep, a good example of this was done a good while back when detuned sets of
identical oscillators were made to use the same wavetable.
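As a rough sketch of that shared-wavetable idea (not ZynAddSubFX's actual code): one table is computed once, and each detuned voice only keeps its own phase and frequency. The table size, sample rate, detune amounts and function names below are all assumptions chosen for illustration.

```python
import math

TABLE_SIZE = 2048       # hypothetical table size
SAMPLE_RATE = 48000.0   # hypothetical sample rate

# One period of a waveform, computed a single time and shared by all voices.
table = [math.sin(2.0 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]

def read_linear(tbl, phase):
    """Read tbl at a fractional phase in [0, 1) with linear interpolation."""
    pos = phase * len(tbl)
    i = int(pos)
    frac = pos - i
    a = tbl[i % len(tbl)]
    b = tbl[(i + 1) % len(tbl)]
    return a + frac * (b - a)

def render_unison(tbl, base_hz, detune_cents, n_samples):
    """Mix several detuned voices that all read the same table."""
    freqs = [base_hz * 2.0 ** (c / 1200.0) for c in detune_cents]
    phases = [0.0] * len(freqs)
    out = []
    for _ in range(n_samples):
        acc = 0.0
        for v, f in enumerate(freqs):
            acc += read_linear(tbl, phases[v])
            # Only the per-voice phase advances; the table is never recomputed.
            phases[v] = (phases[v] + f / SAMPLE_RATE) % 1.0
        out.append(acc / len(freqs))
    return out

samples = render_unison(table, 220.0, [-7.0, 0.0, 7.0], 480)
```

The per-voice state is two floats, so adding voices costs table reads but no extra spectrum-to-wavetable conversions.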
'The user has full control over the full spectrum'. The question is
if this is necessary - if all that detail is really perceptible in
the final output. If it is not, then there is no point in generating
it in the first place.
I'd side with it not really being needed, but going for a more restricted
interface does require careful thought on how the user is going to interact with
the software.
ZynAddSubFX is a very large complex beast even just within the oscillator
generator and I'll admit that I don't have any bright ideas on how to better
express the parameter space without negatively impacting some portion of the
existing use cases.
Designing a good user experience is hard IMO, and getting it right isn't going to
be as simple as reading a book or two. The fairly general nature
of some of zyn's components certainly makes nailing things down somewhat harder.
There's been a lot of research in recent years into sparse representation
of some kinds of signals and into compressive sampling, but these things
are not simple from a computational POV. And all of it is about efficiently
capturing signals, not generating them.
Yeah, that's the side of things that I've been exposed to.
Lots of compressed sensing and sparse model based representations used to
extract information out and manipulate it.
It is somewhat hard to verify whether you have managed to get yourself into a bubble
and overlooked nearby work which you have not directly interacted with.
The resulting wavetable is fairly large to make the error of linear interpolation
small (so as to minimize the normal running cost).
Additionally the output from traversing the wavetable can be the source for a
number of nonlinear functions (FM/PM source function and distortions).
If there weren't any nonlinear functions later in the chain, then there might be
some additional flexibility, but I don't perceive too much wiggle room without
precalculating the possible wavetables.
Also, the idea of a set critical bandwidth is broken here due to the ability to
modulate wildly without recalculating the base wavetable.
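The link between table size and linear interpolation error mentioned above can be checked numerically. This is a minimal sketch, assuming a table that holds a single sine harmonic; the function name, table sizes and probe count are made up for illustration and are not taken from ZynAddSubFX.

```python
import math

def max_interp_error(table_size, harmonic=1, probes_per_cell=8):
    """Largest error found when a sine harmonic is stored in a table
    and read back between table entries with linear interpolation."""
    table = [math.sin(2.0 * math.pi * harmonic * i / table_size)
             for i in range(table_size)]
    worst = 0.0
    for i in range(table_size):
        a = table[i]
        b = table[(i + 1) % table_size]
        for k in range(probes_per_cell):
            frac = k / probes_per_cell
            approx = a + frac * (b - a)
            exact = math.sin(2.0 * math.pi * harmonic * (i + frac) / table_size)
            worst = max(worst, abs(approx - exact))
    return worst

for size in (256, 1024, 4096):
    print(size, max_interp_error(size), max_interp_error(size, harmonic=64))
```

The error shrinks roughly with the square of the table size and grows with the square of the harmonic number, which is why a table that must hold high harmonics ends up large if plain linear interpolation is to stay cheap and accurate.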
If a signal from a wavetable is used as input to non-linear processes,
or modulated wildly that means that much of the detailed spectral
information contained in the wavetable is modified in complex ways or
smeared out or just destroyed. Which again raises the question if that
information was required there in the first place.
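A small illustration of how a nonlinear stage rewrites the spectrum of whatever the wavetable contained: a single sine harmonic is passed through a tanh waveshaper and the harmonic magnitudes are compared before and after. The waveshaper, its drive of 3.0 and the table size are arbitrary stand-ins for "a distortion", not anything specific to ZynAddSubFX.

```python
import math

N = 256  # samples per period; a hypothetical size for illustration

def harmonic_magnitudes(signal, count):
    """Return magnitudes of harmonics 1..count of one period (direct DFT)."""
    n = len(signal)
    mags = []
    for h in range(1, count + 1):
        re = sum(s * math.cos(2.0 * math.pi * h * i / n)
                 for i, s in enumerate(signal))
        im = sum(s * math.sin(2.0 * math.pi * h * i / n)
                 for i, s in enumerate(signal))
        mags.append(2.0 * math.hypot(re, im) / n)
    return mags

# A table holding a single harmonic.
clean = [math.sin(2.0 * math.pi * i / N) for i in range(N)]
# tanh waveshaper with an arbitrary drive of 3.0.
shaped = [math.tanh(3.0 * s) for s in clean]

clean_mags = harmonic_magnitudes(clean, 8)
shaped_mags = harmonic_magnitudes(shaped, 8)
for h, (c, d) in enumerate(zip(clean_mags, shaped_mags), start=1):
    print(h, round(c, 4), round(d, 4))
```

The shaped signal carries energy at harmonics the table never contained, so the output spectrum is no longer a direct image of the one the user specified.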
The answer to that is that yes, most of the information in the original signal
isn't really going to be present there in a useful way, but changing the
original signal while remaining perceptually close to the distorted version is
hard, though this is more of a problem with maintaining compatibility than
anything else.
In summary, simplifying the current algorithms while preserving the
exact output you have now may well be very difficult or impossible.
But I doubt very much that you need the exact same output.
Yes, I agree completely.
--Mark