I'm potentially in the market for a new box before end of year. Has anyone
found anything to show whether Athlon64 (or the high-end FX line) are worth
it for audio? Assume some legacy audio code under legacy OS :-/
Tim
Benno Senoner and I were discussing the usual fixed point vs floating point
question on IRC today (with regard to some resampling code).
We developed some tests and ran them on a variety of computers.
It would be interesting if people here could run them on different computers
(especially non-x86 ones, like amd64 or the Gx processors) so we can
see what performance we can expect on each, and how things
seem to be shaping up for the future. It will also be a key factor
in how our projects develop in the future.
The code is available at:
http://reduz.dyndns.org/resamp_fixp.c // fixed point version
http://reduz.dyndns.org/resamp_float.c // floating point version, portable
http://reduz.dyndns.org/resamp_float_fistl.c // X86 VERSION ONLY!! Uses the fistl instruction
The results don't measure just int vs float performance. They
also test float -> int conversion, which is common
in most algorithms that work with buffers. The test is a linearly
interpolated resampler with volume control.
Please use the GCC options
-O3 -ffast-math -march=<yourcpu> so it's a fair comparison.
Results from other compilers are also appreciated!
Here are some results for reference:
****************************************************************
vendor_id : AuthenticAMD
model name : AMD-K6(tm) 3D processor
cpu MHz : 412.508
cache size : 64 KB
flags : fpu vme de pse tsc msr mce cx8 pge mmx syscall 3dnow k6_mtrr
bogomips : 822.47
resamp_fixp - 0m8.460s
resamp_float_fistl - 0m27.390s
****************************************************************
vendor_id : AuthenticAMD
model name : AMD Duron(tm) Processor
cpu MHz : 951.701
cache size : 64 KB
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat
pse36 mmx fxsr syscall mmxext 3dnowext 3dnow
bogomips : 1900.54
resamp_float - 0m11.180s
resamp_float_fistl - 0m5.810s
resamp_fixp_optimized - 0m2.790s
************************************************
Benno gave me some results:
Intel p4 1800 celeron
resamp_float_fistl - 4.00user
resamp_fixp - 4.51user
(float faster?)
--------------------------------
VIA nehemiah 1GHz
resamp_float_fistl - 0m21.079s
resamp_fixp - 0m7.129s
*************************
Some results on SPARC:
2x 125MHz HyperSPARC
resamp_fixp - user 2m59.010s
resamp_float - user 0m44.290s
(float faster! :)
****************************
Intel 2.4 GHz P4:
resamp_float - 0m3.440s
resamp_float_fistl - 0m2.960s
resamp_fixp - 0m1.450s
************************
1.25GHz G4 (laptop)
resamp_float - 0m11.170s
resamp_fixp - 0m2.130s
-----------------------------------
Conclusions SO FAR.
My own conclusion on the subject is that the float -> int conversion is
STILL the biggest bottleneck on most common architectures. Until this is
sorted out, fixed point is still the best solution for some specific cases,
and I don't see any problem mixing it with floating point code. If you look
closely at that algorithm in the source, you could replace "counter/increment"
with purely fixed point values, and do the rest (managing the samples) in
float. This will undoubtedly speed it up.
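A minimal sketch of that mixed approach, not the code from resamp_fixp.c:
the read position ("counter") and step ("increment") are kept in fixed point,
so taking the sample index is a shift instead of a float -> int conversion,
while interpolation and volume are done in float. The 16.16 format and the
names resample_mixed, FRAC_BITS, FRAC_ONE and FRAC_MASK are assumptions made
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAC_BITS 16                       /* assumed 16.16 fixed-point format */
#define FRAC_ONE  (1u << FRAC_BITS)        /* 1.0 in fixed point */
#define FRAC_MASK (FRAC_ONE - 1u)          /* mask for the fractional part */

/* Linearly interpolated resampling with volume control.
 * increment is the playback rate in 16.16 fixed point (FRAC_ONE == 1.0).
 * Returns the number of output samples written to dst. */
size_t resample_mixed(const float *src, size_t src_len,
                      float *dst, size_t dst_len,
                      uint32_t increment, float volume)
{
    uint64_t counter = 0;                  /* read position; 64 bits so it cannot wrap here */
    size_t n = 0;

    assert(src != NULL && dst != NULL);

    while (n < dst_len) {
        size_t idx = (size_t)(counter >> FRAC_BITS);   /* integer part: just a shift */
        if (idx + 1 >= src_len)
            break;                                     /* src[idx + 1] is needed to interpolate */

        /* int -> float for the fraction; no float -> int anywhere in the loop */
        float frac = (float)(counter & FRAC_MASK) * (1.0f / (float)FRAC_ONE);
        float s = src[idx] + (src[idx + 1] - src[idx]) * frac;

        dst[n++] = s * volume;
        counter += increment;
    }
    return n;
}
```

For example, calling resample_mixed(src, len, dst, out_len, FRAC_ONE / 2, 1.0f)
reads the source at half speed, i.e. it produces roughly two output samples
per input sample.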
Cheers!
Juan Linietsky
On Thu, 27 Nov 2003, Martijn Sipkema wrote:
> [...]
> > So how is the low-latency situation for 2.6? I did install 2.6 on
> > my private machine, but was not able to get better performance
> > than 2.4 with ll+pre (kicked out of jack-graph pretty soon with 128
> > frames period). Is there a trick to get better lowlatency performance with
> > 2.6 I don't know about?
>
> Did you compile with CONFIG_PREEMPT enabled?
>
Yes. :)
--
Have you seen this:
http://gsd.ime.usp.br/~lago/masters/
It is a networked LADSPA plugin. Maybe it could be made into a jack system.
What would be nice is to have something like your net client on one computer,
and a network jackdriver running the jackgraph on a different computer.
You do get one period extra latency between the two graphs.
G
--
electronic & acoustic musics-- http://www.xs4all.nl/~gml
Hi,
i have this silly idea of implementing a very simple jack client.
It would do nothing else, but offer a number of writable and readable
ports [lets assume 2 of each for now]. Now, in the background, this
jack-client opens a number of UDP sockets [me is not a network expert,
so bear with me here :)]. 2 UDP ports on which it listens to incoming
data, which it then feeds into the readable ports.
Any data that comes in on its writable ports is sent out just as
simply: it opens 2 UDP sockets, specifies a target address and sends out
the audio data.
The mechanism is basically completely connection-less. Packets that
belong to the same stream just come from the same IP address, maybe
carrying an index number to denote the port number.
This way it would be very easy to stream data from one jack graph to
another on a different computer. Latencies will not be considered at
all. These connections must be considered to be async. So a send to
another computer, feeding into an effect there and then sending back,
will not really be a "realtime" thing [hmm, well, depends on the network
speed and how big the network buffers are, etc.. i assume both are on a
private local lan here]
Also, some basic/minimal network protocol probably has to be
implemented. At least a sequential numbering or timestamp, so the
receiver can maintain a buffer and sort packets that come in out of
order..
Again i'm no expert on this. But it sounds fairly simple.
What do you think? This could of course be made arbitrarily complex
[with timing stuff, connection handling, authentication, etc], but i'm
looking for something dead simple.
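A rough sketch of what such a minimal protocol could look like, under
stated assumptions: each UDP datagram starts with an 8-byte header holding a
sequence number, the port index and a sample count, and the receiver maps
sequence numbers onto a small ring of slots to put packets back in order.
The layout and the names pkt_hdr, hdr_pack, hdr_unpack and reorder_slot are
hypothetical; the actual socket and jack calls are left out.

```c
#include <assert.h>
#include <stdint.h>

#define HDR_SIZE 8   /* 4 bytes sequence + 2 bytes port index + 2 bytes sample count */

struct pkt_hdr {
    uint32_t seq;      /* increases by one per packet, lets the receiver reorder */
    uint16_t port;     /* index of the jack port this audio belongs to */
    uint16_t nframes;  /* number of samples following the header */
};

/* Write the header into buf in big-endian (network) byte order. */
void hdr_pack(const struct pkt_hdr *h, unsigned char *buf)
{
    buf[0] = (unsigned char)(h->seq >> 24);
    buf[1] = (unsigned char)(h->seq >> 16);
    buf[2] = (unsigned char)(h->seq >> 8);
    buf[3] = (unsigned char)(h->seq);
    buf[4] = (unsigned char)(h->port >> 8);
    buf[5] = (unsigned char)(h->port);
    buf[6] = (unsigned char)(h->nframes >> 8);
    buf[7] = (unsigned char)(h->nframes);
}

/* Read a header back from the first HDR_SIZE bytes of a received datagram. */
void hdr_unpack(struct pkt_hdr *h, const unsigned char *buf)
{
    h->seq = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
             ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
    h->port    = (uint16_t)((buf[4] << 8) | buf[5]);
    h->nframes = (uint16_t)((buf[6] << 8) | buf[7]);
}

/* Receiver side: return the ring slot for packet seq, or -1 if the packet
 * is late (older than next_expected) or too far ahead to be buffered.
 * slots should be a power of two so the mapping stays consistent when
 * seq wraps around. */
int reorder_slot(uint32_t seq, uint32_t next_expected, uint32_t slots)
{
    uint32_t ahead;

    assert(slots > 0);
    ahead = seq - next_expected;           /* unsigned subtraction handles wrap-around */
    if (ahead >= slots)
        return -1;
    return (int)(seq % slots);
}
```

The sender would fill a buffer with hdr_pack() followed by the samples and
hand it to sendto(); the receiver would call hdr_unpack() on each datagram
and copy the samples into the slot chosen by reorder_slot(), dropping the
packet when it returns -1.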
Flo
--
music: http://www.soundclick.com/bands/9/florianschmidt.htm
Paul Davis:
> >Since mainstream capabilities support seems always to be somewhere
> >over the horizon, I am interested in the patch Paul and Steve
> >mentioned. IIUC, it defines a control file in /proc which, if
> >enabled, allows any process access to scheduling and memory locking
> >privileges. No other capabilities are provided. I would love to see
> >a copy of this patch to study exactly what it does.
>
> its a very simple patch, IIRC. it just short-circuits the checks on
> uid==0 and/or capabilities when assigning SCHED_FIFO and/or locking
> memory.
>
> i'm looking for it in my archives. i'm a bit worried i may have
I couldn't wait til you found it, so I wrote one from scratch instead. :)
The url below points to a hackish patch against 2.4.23-rc1, and yes, it is
very simple. It works by setting /proc/sys/kernel/setschedandmlock to 1.
http://www.notam02.no/arkiv/src/schedmlockpatch-2.4.23-rc1
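To check from user space whether a process is actually allowed to do the two
things discussed here, something like the following sketch could be used. It
only tries the standard sched_setscheduler() and mlockall() calls and undoes
them again; it does not touch the /proc entry itself. The function name
probe_rt_privileges and the CAN_* flags are made up for this example, and
Linux is assumed.

```c
#include <assert.h>
#include <sched.h>
#include <string.h>
#include <sys/mman.h>

#define CAN_SCHED_FIFO 1   /* SCHED_FIFO could be assigned */
#define CAN_MLOCK      2   /* memory could be locked */

/* Try to switch to SCHED_FIFO at the given priority and to lock memory,
 * then restore the previous state. Returns a bitmask of CAN_* flags. */
int probe_rt_privileges(int priority)
{
    int result = 0;
    int old_policy = sched_getscheduler(0);     /* 0 means the calling process */
    struct sched_param old_param, rt_param;

    assert(priority >= 1);
    memset(&old_param, 0, sizeof old_param);
    memset(&rt_param, 0, sizeof rt_param);
    sched_getparam(0, &old_param);
    rt_param.sched_priority = priority;

    if (sched_setscheduler(0, SCHED_FIFO, &rt_param) == 0) {
        result |= CAN_SCHED_FIFO;
        if (old_policy >= 0)
            sched_setscheduler(0, old_policy, &old_param);  /* drop back */
    }

    if (mlockall(MCL_CURRENT | MCL_FUTURE) == 0) {
        result |= CAN_MLOCK;
        munlockall();                                       /* unlock again */
    }

    return result;
}
```

Run as a normal user, a result of 0 would suggest that neither privilege is
available; comparing the result before and after enabling the control file
is one way to see what the patch changes.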
--
Specimen is a midi controlled sampler, and its very first public
release is available from http://www.gazuga.net/specimen.tar.gz
In all honesty, I advise against downloading this in its current
state unless you intend to hack on it or play with it for sheer
amusement value only. There are a few TODO items that must be taken
care of before this can be considered usable software (as the next
release will be). The main reason for this release is because I said
I was going to make a release in a couple of weeks 15 days ago.
It's very nascent stuff.
You have been warned.
[pb]
7 pages written by Albert Veli
Take a look at the front page :-)
http://www.datormagazin.se/ssjs/aktuellt_nr.html
It mentions, with varying levels of detail:
* Alsa (three pages)
* GNOME ALSA Mixer (with screen dump?
- it looks old and boring compared to KMix in KDE 3.2 IMHO)
* Jack
* Qjackctl (with screen capture?)
* Fluidsynth
* Timidity++
* LADSPA
* Jack-rack
* Ecamegapedal (with screen capture)
* Rosegarden-4 (with screen capture)
* ReZound (with screen capture)
* Muse
* Audacity
* Ardour ["problematic to compile, steep learning curve"; the article
suggests looking at www.djcj.org/LAU/quicktoots]
* Ecasound
* AlsaModularSynth
* Legasynth
* SpiralSynth
* Pd
* Soundtracker, cheesetracker
* Alsaplayer
* xmms
/RogerL
--
Roger Larsson
Skellefteå
Sweden
>From: Benno Senoner <sbenno(a)gardena.net>
>
>the method I use to communicate between the GUIs is SYSV message queues
>because they can be multiclient but the API is abstracted so the
>underlying transport model can be chosen arbitrarily.
I have written, and am still writing, a message system where the client sends,
e.g., the following msg to the server:
read <file id> <loc> <len> <addr> <return value>
which reads a block of audio from a file to a (shared) memory
location.
I did not see the return value in your system (and can't figure
out how one manages without it). The server sends both the result
and the given return value back to the client. The return value
could be something which only the client knows, e.g., array
index, memory address, text. Something which associates the
result with the request.
The implementation is too complex for my taste, as the messages
are a combination of raw byte data and string data. Each message
starts with an <int> which tells the length of the message. I have
written a set of routines to assemble and disassemble these
messages. I send these messages via pipes/sockets, via shared
memory, and whatever.
In any case, such a message system should be general enough,
and the return value moves it in that direction.
Hmm... what do you mean by multiclient? Is the writing to the
message queue an "atomic" operation for long messages?
I use one pipe/socket per client to one server because I suspect
only the server can make the long messages "atomic" by interleaving
them correctly. Likewise, I use one lock-free shared-memory FIFO
per client to one server.
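The length-prefixed framing described above could be sketched roughly as
follows. This is not the actual implementation: it assumes a 4-byte
big-endian length that counts only the payload, and a purely textual
payload, whereas the real messages mix raw bytes and strings. The names
msg_frame, msg_unframe and build_read_request are invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Frame a message: 4-byte big-endian payload length, then the payload.
 * Returns the total number of bytes written, or 0 if out is too small. */
size_t msg_frame(unsigned char *out, size_t cap, const void *payload, uint32_t len)
{
    assert(out != NULL && payload != NULL);
    if (cap < 4 || cap - 4 < len)
        return 0;
    out[0] = (unsigned char)(len >> 24);
    out[1] = (unsigned char)(len >> 16);
    out[2] = (unsigned char)(len >> 8);
    out[3] = (unsigned char)(len);
    memcpy(out + 4, payload, len);
    return 4 + (size_t)len;
}

/* Take a message apart again. Returns the payload length and points
 * *payload into the input buffer, or returns -1 if fewer than a whole
 * message has arrived so far (the caller should read more data). */
int64_t msg_unframe(const unsigned char *in, size_t avail, const unsigned char **payload)
{
    uint32_t len;

    assert(in != NULL && payload != NULL);
    if (avail < 4)
        return -1;
    len = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
          ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    if (avail - 4 < len)
        return -1;
    *payload = in + 4;
    return (int64_t)len;
}

/* Build the text of a "read" request. The last field is the client's
 * return value, which the server sends back unchanged with the result so
 * the client can match the reply to the request. */
int build_read_request(char *text, size_t cap, int file_id, long loc,
                       long len, unsigned long addr, unsigned long ret_value)
{
    return snprintf(text, cap, "read %d %ld %ld %lu %lu",
                    file_id, loc, len, addr, ret_value);
}
```

Because every message announces its own length, the reader on a pipe or
socket can always tell where one message ends and the next begins, even when
a single read() returns only part of a message or several at once.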
Regards,
Juhana
Have you seen this?
http://faac.sourceforge.net/
I was browsing sourceforge, and found this, it might be of interest to some
people on this list.
-Richard