> On Wednesday 28 November 2007, Dave Robillard wrote:
> [...]
> > > Obviously, this effect is less visible with non-trivial plugins
> > > - but even so; how many plugins in a "normal" graph are actually
> > > CPU bound? In my experience, the effect is very pronounced
> > > (several times higher CPU load with 512 samples/buffer than with
> > > 32) when using a bunch of sampleplayer voices (cubic
> > > interpolation + oversampling) and some simple effects. You have
> > > to do some pretty serious processing to go CPU bound on a modern
> > > PC...
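(A minimal sketch of the kind of comparison being made, for anyone
who wants to try it: push the same total amount of audio through a
chain of trivial gain "plugins" at 32 and at 512 samples per buffer
and compare wall-clock time. The chain length, frame count and the
gain stand-in are arbitrary choices for the example, not anything
from a real host.)

/* Same total work at two buffer sizes; only the chunking differs. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define TOTAL_FRAMES (1L << 24)  /* ~16M frames of audio     */
#define CHAIN_LEN    8           /* "plugins" in the graph   */

/* stand-in for a plugin run() callback: a simple gain */
static void run_gain(const float *in, float *out, int nframes)
{
    for (int i = 0; i < nframes; ++i)
        out[i] = in[i] * 0.5f;
}

static double bench(int nframes)
{
    float *buf[CHAIN_LEN + 1];
    struct timespec t0, t1;
    int p;

    for (p = 0; p <= CHAIN_LEN; ++p)
        buf[p] = calloc(nframes, sizeof(float));

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long done = 0; done < TOTAL_FRAMES; done += nframes)
        for (p = 0; p < CHAIN_LEN; ++p)
            run_gain(buf[p], buf[p + 1], nframes);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    volatile float sink = buf[CHAIN_LEN][0];  /* keep results live */
    (void)sink;
    for (p = 0; p <= CHAIN_LEN; ++p)
        free(buf[p]);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
}

int main(void)
{
    printf("32 samples/buffer:  %.3f s\n", bench(32));
    printf("512 samples/buffer: %.3f s\n", bench(512));
    return 0;
}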
> > Not really true for linear scans through memory, which reading
> > plugin buffers usually is. This is the ideal cache situation, and
> > you can definitely get full utilization sweeping through things
> > much larger than the cache this way (this is exactly what the
> > cache is designed to do, and it does it well; try it).
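(The "try it" here is easy to do directly; a rough sketch, with the
256 MB figure picked arbitrarily so the block is far larger than any
cache:)

/* Sequential sweep over an out-of-cache block: the prefetch-friendly
 * case. Prints the effective read bandwidth. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    const size_t n = 256u * 1024 * 1024 / sizeof(float);  /* 256 MB */
    float *data = malloc(n * sizeof(float));
    struct timespec t0, t1;
    double acc = 0.0, s;
    size_t i;

    for (i = 0; i < n; ++i)
        data[i] = 1.0f;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < n; ++i)  /* linear scan: ideal for the cache */
        acc += data[i];
    clock_gettime(CLOCK_MONOTONIC, &t1);

    s = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
    printf("%.2f GB/s (acc=%g)\n", n * sizeof(float) / s / 1e9, acc);
    free(data);
    return 0;
}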
> I did, and that's what I'm drawing my conclusions from. ;-)
>
> No cache controller logic in the world can avoid the bottleneck
> that is memory bandwidth. If your data has been kicked from the
> cache, it needs to get back in, and the only time you're not taking
> a hit from that is when you're doing sequential access *and* your
> code is CPU bound, rather than memory bound.
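(The two cases can be seen side by side by sweeping the same
out-of-cache block once in order and once in a random order; the gap
between the two times is the memory system, since the arithmetic is
identical. The sizes and the little PRNG are just for illustration:)

/* Sequential vs. random sweep over the same out-of-cache block. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (32u * 1024 * 1024)  /* 32M floats, ~128 MB */

static unsigned rng_state = 12345u;
static unsigned rng(void)  /* small LCG; quality doesn't matter here */
{
    return rng_state = rng_state * 1664525u + 1013904223u;
}

static double sweep(const float *data, const unsigned *idx)
{
    struct timespec t0, t1;
    double acc = 0.0;
    unsigned i;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < N; ++i)
        acc += data[idx ? idx[i] : i];
    clock_gettime(CLOCK_MONOTONIC, &t1);

    fprintf(stderr, "(acc=%g)\n", acc);  /* keep the sum live */
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
}

int main(void)
{
    float *data = malloc(N * sizeof(float));
    unsigned *idx = malloc(N * sizeof(unsigned));
    unsigned i;

    for (i = 0; i < N; ++i) {
        data[i] = 1.0f;
        idx[i] = i;
    }
    for (i = N - 1; i > 0; --i) {  /* Fisher-Yates shuffle */
        unsigned j = rng() % (i + 1);
        unsigned t = idx[i]; idx[i] = idx[j]; idx[j] = t;
    }

    printf("sequential: %.3f s\n", sweep(data, NULL));
    printf("random:     %.3f s\n", sweep(data, idx));
    free(data);
    free(idx);
    return 0;
}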
Well, sure, but big data is big data. In the typical case, plugin
buffers are much smaller than the cache (especially on recent CPUs) -
crunching away on plain old audio here is definitely CPU bound (with
properly written RT safe host+plugins, anyway).

No matter how you slice it, if you want to play with a massive chunk
of data you're bound by memory bandwidth*, sure. So what? :)
-DR-
(* until everything is parallel like it really should be, and this
problem goes away...)
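(Back-of-envelope on the "buffers are much smaller than the cache"
point above; the port count is made up and the cache figure is a
typical current number, not a measurement:)

#include <stdio.h>

int main(void)
{
    const unsigned frames = 512;                  /* largish buffer */
    const unsigned ports  = 16;                   /* assumed graph  */
    const unsigned buf_kb = frames * (unsigned)sizeof(float) / 1024;

    printf("one port buffer:     %u KB\n", buf_kb);          /* 2 KB  */
    printf("16-port working set: %u KB\n", ports * buf_kb);  /* 32 KB */
    printf("typical L2 cache:    2048-4096 KB\n");
    return 0;
}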