* Lee Revell <rlrevell@joe-job.com> wrote:
> Does this scale in a linear fashion with respect to CPU speed? You
> mentioned you were testing on a 2GHz machine, does 700 usecs on that
> translate to 2800 usecs on a 500MHz box?
given that this particular latency is dominated by cache misses, DRAM
latency controls it too, and that is independent of CPU MHz. Wrt.
cache-miss costs, newer CPUs typically have twice the L2-cache line
size (so it takes more bus cycles to fetch a line), but the improved
bandwidth of fetching a single line should offset most of this. DRAM
latencies didn't improve much in the past 10 years, so that's almost a
constant between a 500MHz/100MHz(system-bus) and a 2GHz/400MHz system.
So i'd guesstimate a 500MHz box to do somewhere around 1000-1500 usecs.
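
(if you want to sanity-check the DRAM-dominance assumption on a given
box, something like the pointer-chasing sketch below will do - it's
illustrative only, all names and sizes in it are made up, not from any
patch in this thread. Every load depends on the previous one, so the
per-step cost tracks memory latency, not CPU MHz:)

/* chase.c - rough dependent-load latency probe: each load depends on
 * the previous one, so the average cost per step approximates the
 * cache-miss/DRAM round-trip rather than anything CPU-MHz-bound.
 * Build: gcc -O2 chase.c -o chase  (add -lrt on older glibc)
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define ENTRIES	(4 * 1024 * 1024)	/* tens of MB, well past any L2 */
#define STEPS	(10 * 1000 * 1000)

int main(void)
{
	void **ring = malloc(ENTRIES * sizeof(*ring));
	struct timespec t0, t1;
	long steps = STEPS;
	void **p;
	size_t i;
	double ns;

	if (!ring)
		return 1;
	for (i = 0; i < ENTRIES; i++)
		ring[i] = &ring[i];
	/* Sattolo shuffle: turns the array into one big random cycle,
	 * which defeats hardware prefetch */
	for (i = ENTRIES - 1; i > 0; i--) {
		size_t j = (size_t)rand() % i;
		void *tmp = ring[i];
		ring[i] = ring[j];
		ring[j] = tmp;
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	p = (void **)ring[0];
	while (steps--)
		p = (void **)*p;		/* dependent load chain */
	clock_gettime(CLOCK_MONOTONIC, &t1);

	ns = ((t1.tv_sec - t0.tv_sec) * 1e9 +
	      (t1.tv_nsec - t0.tv_nsec)) / (double)STEPS;
	/* print p so the compiler cannot drop the chase loop */
	printf("%.1f ns per dependent load (cookie %p)\n", ns, (void *)p);
	return 0;
}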
> How much I/O do you allow to be in flight at once? It seems like by
> decreasing the maximum size of I/O that you handle in one interrupt
> you could improve this quite a bit. Disk throughput is good enough
> that anyone in the real world who would feel a 10% hit would just
> throw hardware at the problem.
i'm not sure whether this particular value (the max # of sg-entries per
IO op) is runtime-tunable. Jens? It might make sense to enable
elvtune-alike tunability of this value.
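
(just to show the kind of knob i mean - a minimal sketch, assuming the
limit lives in a variable a module can see. "max_sg_entries" is an
invented name, not an existing knob, and where the real limit lives is
exactly the question for Jens:)

/* sketch: expose a hypothetical per-IO sg-entry limit as a runtime-
 * writable module parameter; with a non-zero perm value it shows up
 * under /sys/module/<module>/parameters/ on 2.6. */
#include <linux/module.h>
#include <linux/moduleparam.h>

static int max_sg_entries = 128;	/* illustrative default */
module_param(max_sg_entries, int, 0644);
MODULE_PARM_DESC(max_sg_entries,
		 "max scatter-gather entries merged into one IO op");

MODULE_LICENSE("GPL");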
limiting the in-flight IO will only work with IDE/PATA, which doesn't
have multiple commands in flight for a given disk. SATA and SCSI handle
multiple command completions per IRQ invocation, so limiting the size
of a single IO op has less effect there.
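
(to illustrate why: a TCQ/NCQ-style completion handler drains however
many commands the controller has finished, roughly like the
hypothetical sketch below - every name in it is invented - so capping
the sg-count of one request doesn't cap the total work done in one IRQ:)

#include <stddef.h>

/* hypothetical completion path: with TCQ/NCQ one IRQ invocation can
 * complete many commands, so the IRQ-context work is N commands times
 * the per-command sg-walk cost, however small a single IO op is kept. */
struct cmd {
	struct cmd	*next;
	int		nr_sg;		/* sg entries of this command */
};

static void complete_one(struct cmd *c)
{
	/* unmap c->nr_sg scatter-gather entries, end the request, ... */
	(void)c;
}

static void controller_irq(struct cmd *done_list)
{
	struct cmd *c;

	/* drain everything the hardware finished since the last IRQ */
	while ((c = done_list) != NULL) {
		done_list = c->next;
		complete_one(c);
	}
}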
Ingo