Does anyone here know if splitting code across different files, or for
that matter, reordering the layout of one source file so that
functions called together are now "far apart", can actually affect
execution speed?
Not on x86, at least, since we have a flat address space: the distance
between caller and callee does not by itself change the cost of a call.
I don't know about other architectures, of course, but I suspect not.
The exception is, of course, cache. Things that sit close together tend to
be either both in cache or both out of it (at cache-line granularity). This
probably matters more for data than for code.
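
To make the cache-line point a bit more concrete, here is a rough C sketch that checks whether two addresses would land in the same cache line. It assumes a 64-byte line, which is typical on x86 but not guaranteed, and the helper names (line_index, same_cache_line, lines_spanned) are made up for illustration. It only does address arithmetic; it does not measure anything.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: 64-byte cache lines. Typical on x86, but check your CPU. */
#define ASSUMED_LINE_SIZE 64u

/* Index of the (assumed) cache line that contains address p. */
static uintptr_t line_index(const void *p)
{
    return (uintptr_t)p / ASSUMED_LINE_SIZE;
}

/* True if both addresses fall inside the same (assumed) cache line. */
static bool same_cache_line(const void *a, const void *b)
{
    return line_index(a) == line_index(b);
}

/* Number of distinct cache lines touched by the byte range [p, p + len). */
static size_t lines_spanned(const void *p, size_t len)
{
    if (len == 0)
        return 0;
    uintptr_t first = (uintptr_t)p / ASSUMED_LINE_SIZE;
    uintptr_t last = ((uintptr_t)p + len - 1) / ASSUMED_LINE_SIZE;
    return (size_t)(last - first + 1);
}
```

For example, calling same_cache_line(&s.a, &s.b) on two fields of a struct tells you whether touching one is likely to pull in the other. Two items that share a line get loaded together; two items in different lines may each cost a separate miss. The same arithmetic applies to function addresses, though as said above the effect is probably more noticeable for data.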