On Mon, 2009-05-11 at 10:23 -0400, Paul Davis wrote:
> 1) the question is not how to fit a single set of N samples into cache
> memory. It's how to fit *all* the samples to be processed in a given
> "cycle" into cache memory. Wasting 25% of cache memory for each buffer
> isn't conducive to this.

If 96 frames are enough for stability (and, say, 64 are not), then
samples 96-127 in a 128-frame buffer are a waste of memory anyway and
only add to latency.
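
To put rough numbers on that, here is a minimal sketch of the
arithmetic. The helper names are made up for illustration, and the
32-bit float sample format and 48 kHz rate used in the note below are
assumptions, not something stated in this thread.

```c
#include <stddef.h>

/* Bytes left unused when only `used` frames of a `capacity`-frame
 * buffer are ever filled (hypothetical helper). */
size_t wasted_bytes(size_t capacity, size_t used, size_t sample_size)
{
    return (capacity - used) * sample_size;
}

/* Duration of one buffer in whole microseconds at the given sample
 * rate (hypothetical helper, truncates the fraction). */
unsigned long buffer_latency_us(unsigned long frames, unsigned long rate_hz)
{
    return frames * 1000000UL / rate_hz;
}
```

With 4-byte samples, wasted_bytes(128, 96, 4) is 128 bytes per buffer,
i.e. 32 of 128 frames, the 25% mentioned above. At an assumed 48 kHz,
96 frames last 2000 us while 128 frames last about 2666 us.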
It may even be that a set of shorter buffers that are only partially
aligned, but allocated as one continuous area, has a greater chance of
fitting into the available cache without evicting other important
data.
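
A rough sketch of what I mean by one continuous area, in C11. The
buffer_pool type and the pool_* functions are hypothetical names for
illustration, not anything from an existing API, and the 64-byte cache
line and float samples are assumptions. Each buffer is only aligned to
the requested boundary (e.g. 16 bytes for SIMD loads); only the start
of the whole area is aligned to a cache line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_LINE 64  /* assumed cache line size in bytes */

typedef struct {
    float *base;    /* start of the single contiguous allocation */
    size_t stride;  /* floats between the starts of two buffers */
    size_t count;   /* number of buffers in the pool */
} buffer_pool;

/* Round n up to a multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Allocate `count` buffers of `frames` floats back to back, each
 * starting on an `align_bytes` boundary. Returns 0 on success. */
int pool_init(buffer_pool *p, size_t count, size_t frames, size_t align_bytes)
{
    if (count == 0 || frames == 0 || align_bytes < sizeof(float))
        return -1;

    size_t stride_bytes = round_up(frames * sizeof(float), align_bytes);
    /* aligned_alloc wants the size to be a multiple of the alignment */
    size_t total = round_up(stride_bytes * count, CACHE_LINE);

    p->base = aligned_alloc(CACHE_LINE, total);
    if (p->base == NULL)
        return -1;
    p->stride = stride_bytes / sizeof(float);
    p->count = count;
    return 0;
}

/* Pointer to the first sample of buffer i. */
float *pool_get(const buffer_pool *p, size_t i)
{
    assert(i < p->count);
    return p->base + i * p->stride;
}

void pool_free(buffer_pool *p)
{
    free(p->base);
    p->base = NULL;
    p->count = 0;
}
```

With pool_init(&p, 8, 96, 16) and 4-byte floats, the eight buffers
occupy 8 * 384 = 3072 bytes, against 8 * 512 = 4096 bytes for eight
128-frame buffers. Whether that actually helps depends on what else is
competing for the cache, so measure before believing it.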