Hi,
On Sat, Apr 05, 2003 at 01:08:30PM +0100, Steve Harris wrote:
There are many cases in audio software when you are
only concerned with
reading single values at a time from the fifo and relative delays, then
its much simpler [from memory, syntax might be wrong]:
unsigned int size = some_power_of_two
unsigned int write_ptr = 0
float buffer[size]
to write:
buffer[write_ptr++ & (size - 1)] = written_val
to read:
float read_val = buffer[(write_ptr - delay) & (size - 1)]
If you want catchup reads, thats:
while (read_ptr != write_ptr)
float read_val = buffer[read_ptr++ & (size - 1)]
Now make that thread-safe and esp. thread-safe on an architecture
with weak memory ordering and all the fun stuff.
If you have that all working and look at the assembly, then my
solution is actually smaller, since all your pointer updates are
required to be atomic anyway and that's what we tryed to optimize
away here.
The no braches and so on generally makes it worth
going to the next power
of two for your buffer sizes, which is why you will see so many outboards
and plugins with 2.7 second maximum delays :) 48000 * 2.7 < 2^17
Yes, your solution is a perfect DSP-chip or hardware solution. It
is even nice for ix86, if you have a not that optimizing
compiler.
But it might be worth to have an special case for these single
value reads/writes in my solution, if you don't modify your data
much after it or always refer to small amounts.
I could also imagine a model, where you do your reads within
range back and "skip" that part of the buffer later.
So sth. like:
char read_value(struct lock_free_fifo *lff, size_t delay) {
size_t avail = atomic_read(&lff->filled);
if (delay >= avail) {
/* Houston, we have a problem! Handling here depends on the
* algorithm that requests the delay. */
return FIXUP_VALUE;
}
return lff->buffer[lff->read_ptr - lff->buffer + delay - lff->max_bytes];
}
would read with delay and
void skip_bytes(struct lock_free_fifo *lff, size_t bytes) {
size_t avail = atomic_read(&lff->filled);
if (bytes > avail)
bytes = avail;
lff->read_ptr += bytes;
if (lff->read_ptr >= lff->end_ptr)
lff->read_ptr = lff->end_ptr;
atomic_sub(&lff->filled, bytes);
}
would implement the skipping. The if's would degenerate to
conditional moves, if CPU and compiler support that.
The logic might be switched from direct memory adressing to
indirect memory adressing. This is esp. benificial, if you don't
work with chars, but with larger data types.
With the above routines, you still minimize locking overhead in
SMP and can use delaying filters without problems.
NOTE: It's still important, that ALL readers are synchronised
with each other (the above ARE readers) and ALL writers
synchronised to each other. Just no reader need to be
synchronised with a writer, as this fifo does it already.
NOTE2: My solution also allows to PHYSICALLY decouple the reader
and the writer. I have used this scheme for Host->DSP and
DSP->Host communication with 1 FIFO per direction. They just
synchronize on "filled" variable by telling each other how
much they have read/written.
For normal computers that means, that you can safely store
your read_ptr one one thread stack and the write_ptr on
another. Then you can copy the constant data (pointer to
memory area used for buffering and fifo size) to have them
in the right cachelines.
So just the "filled" variable is there to be the
synchronisation point and is the only thing, that uses
buslocks and does cachline-ping-pong. The last two are
increasing factors on multiprocessor systems. P4 with
hyperthreading is the same here.
Regards
Ingo Oeser
--
Marketing ist die Kunst, Leuten Sachen zu verkaufen, die sie
nicht brauchen, mit Geld, was sie nicht haben, um Leute zu
beeindrucken, die sie nicht moegen.