[LAD] Fw: Re: Some questions about the Jack callback

Mark D. McCurry mark.d.mccurry at gmail.com
Sat Sep 20 22:21:02 UTC 2014


On 09-20, Fons Adriaensen wrote:
> On Sat, Sep 20, 2014 at 05:01:32PM -0400, Mark D. McCurry wrote:
> 
> > If you are proposing that 256 harmonics are not needed, then is there a
> > transformation that yields an equivalent psycho-acoustic output in less time
> > than the fft would have taken given any possible spectral input?
> > (The user has full control over the full spectrum in terms of phase/magnitude)
> 
> I'm certainly not claiming that there is some simple trick to simplify
> things.

Darn, I was hoping you would point out some interesting way of working in
psycho-acoustic space.
I have worked with it some, but I surely have large gaps in the details and
in the best ways of working with it.

> But the information theory point still stands: if you compute
> 256K samples and only output 256, 512 or 1024 that means that 99%
> of the information you have is thrown away. Which probably
> means it was not necessary to compute those 256K in the first place,
> at least not to produce the first period of output. The only case where
> this is not true is for algorithms that deliberately hide or destroy
> information, i.e. cryptographic ones.

Yep, a good example of this was done a while back, when detuned sets of
identical oscillators were made to share the same wavetable.
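
To make the shared-wavetable idea concrete, here is a minimal sketch in Python. It is not ZynAddSubFX code: the table size, the function names, and the use of a plain sine as the table contents are all assumptions for illustration. The point is only that the table is computed once and each detuned voice differs in nothing but its phase increment.

```python
import math

TABLE_SIZE = 1024  # assumed size, chosen only for illustration


def make_table(size=TABLE_SIZE):
    """One cycle of a sine; stands in for an expensive-to-compute wavetable."""
    return [math.sin(2.0 * math.pi * i / size) for i in range(size)]


def render_unison(table, base_freq, detune_cents, sample_rate, n_samples):
    """Sum several detuned voices that all read the same table."""
    size = len(table)
    # Per-voice phase increment, in table indices per output sample.
    incs = [base_freq * 2.0 ** (c / 1200.0) * size / sample_rate
            for c in detune_cents]
    phases = [0.0] * len(incs)
    out = []
    for _ in range(n_samples):
        acc = 0.0
        for v, inc in enumerate(incs):
            pos = phases[v]
            i0 = int(pos)
            frac = pos - i0
            i1 = (i0 + 1) % size
            # Linear interpolation between adjacent table entries.
            acc += table[i0] + frac * (table[i1] - table[i0])
            phases[v] = (pos + inc) % size
        out.append(acc / len(incs))
    return out


table = make_table()  # computed once, shared by every voice
block = render_unison(table, 440.0, [-10.0, 0.0, 10.0], 48000, 256)
```

Adding a voice here costs one more table read per sample rather than another table computation.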

> 'The user has full control over the full spectrum'. The question is
> if this is necessary - if all that detail is really perceptible in
> the final output. If it is not, then there is no point in generating
> it in the first place.

I'd side with it not really being needed, but going for a more restricted
interface does require careful thought on how the user is going to interact
with the software.
ZynAddSubFX is a very large, complex beast even just within the oscillator
generator, and I'll admit that I don't have any bright ideas on how to better
express the parameter space without negatively impacting some portion of the
existing use cases.
Designing a good user experience is hard IMO. Getting it right isn't going to
be as simple as reading a book or two, and the fairly general nature of some
of zyn's components certainly makes nailing things down harder.

> There's been a lot of research the last years into sparse representation
> of some kinds of signals and into compressive sampling, but these things
> are not simple from a computational POV. And all of it is about efficiently
> capturing signals, not generating them.

Yeah, that's the side of things that I've been exposed to.
Lots of compressed sensing and sparse model based representations used to
extract information and manipulate it.
It is somewhat hard to tell whether you have gotten yourself into a bubble
and overlooked nearby work which you have not directly interacted with.


> > The resulting wavetable is fairly large to make the error of linear interpolation
> > small (so as to minimize the normal running cost).
> > Additionally the output from traversing the wavetable can be the source for a
> > number of nonlinear functions (FM/PM source function and distortions).
> > If there weren't any nonlinear functions later in the chain, then there might be
> > some additional flexibility, but I don't perceive too much wiggle room without
> > precalculating the possible wavetables.
> > Also, the idea of a set critical bandwidth is broken here due to the ability to
> > modulate wildly without recalculating the base wavetable.
> 
> If a signal from a wavetable is used as input to non-linear processes,
> or modulated wildly that means that much of the detailed spectral
> information contained in the wavetable is modified in complex ways or
> smeared out or just destroyed. Which again raises the question if that
> information was required there in the first place.

Yes: most of the information in the original signal isn't really going to be
present there in a useful way. But changing the original signal while
remaining perceptually close to the distorted version is hard, though this is
more a problem of maintaining compatibility than anything else.
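
On the earlier point that the wavetable is kept large so that linear interpolation error stays small, the trade-off can be sketched numerically. The Python below is an illustration under assumptions, not a measurement of ZynAddSubFX: it uses a single sine harmonic as the table contents, and the function names are made up. For a sine with phase step h between table entries, the standard bound on linear interpolation error is h^2 / 8, so quadrupling the table size should cut the worst-case error by roughly 16.

```python
import math


def max_interp_error(table_size, harmonic=1, probes_per_step=16):
    """Worst observed error when linearly interpolating a sine wavetable."""
    table = [math.sin(2.0 * math.pi * harmonic * i / table_size)
             for i in range(table_size)]
    worst = 0.0
    for i in range(table_size):
        a = table[i]
        b = table[(i + 1) % table_size]
        for j in range(probes_per_step):
            frac = j / probes_per_step
            approx = a + frac * (b - a)
            exact = math.sin(2.0 * math.pi * harmonic * (i + frac) / table_size)
            worst = max(worst, abs(approx - exact))
    return worst


def error_bound(table_size, harmonic=1):
    """Bound h^2 / 8 for a unit-amplitude sine, h = phase step per entry."""
    step = 2.0 * math.pi * harmonic / table_size
    return step * step / 8.0


for size in (256, 1024, 4096):
    print(size, max_interp_error(size), error_bound(size))
```

The same bound grows with the square of the harmonic number, so a table holding high harmonics needs to be correspondingly larger for the same error.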

> In summary, simplifying the current algorithms while preserving the
> exact output you have now may well be very difficult or impossible.
> But I doubt very much that you need the exact same output.

Yes, I agree completely.

--Mark