On Wed, 2009-08-12 at 00:32 +1000, Loki Davison wrote:
> Could you run convolution algos, i.e. jconv stuff,
> on the card via CUDA?
That's a very good question! One user, vvolkov from Berkeley, has posted
some specially designed and optimized FFT code for sizes 256, 512, 1024,
2048, 4096 and 8192 here:
http://forums.nvidia.com/index.php?showtopic=69801
He claims a throughput of no less than 160 Gflop/s on a GTX 280
(which is quite a high-end card).
How much computational power is needed for jconv, just to get a rough
estimate of whether this would be a good idea?
Would it be correct to say that the expected number of flops for an
8192-point FFT would be close to 8 × (8192/2) × log2(8192) = 8 × 4096 × 13,
i.e. approx 426000 flops, most of which can be executed as combined
multiply-add instructions?
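For what it's worth, here is that back-of-the-envelope estimate spelled
out as a small host-side snippet. It simply evaluates the formula above;
the 8 × (N/2) × log2(N) figure is only a rule of thumb (other counting
conventions, e.g. 5 × N × log2(N) for a complex transform, land in the
same ballpark), so treat the output as an order-of-magnitude number.

#include <math.h>
#include <stdio.h>

/* Rough flop count for one N-point FFT, using the 8 * (N/2) * log2(N)
 * figure from above.  Exact constants depend on the algorithm. */
static double fft_flops(double n)
{
    return 8.0 * (n / 2.0) * log2(n);
}

int main(void)
{
    double n = 8192.0;

    /* One FFT: roughly 426k flops.  A frequency-domain convolution of
     * one segment needs a forward FFT, a pointwise complex multiply
     * (about 6 real ops per bin) and an inverse FFT. */
    printf("one %d-point FFT : ~%.0f flops\n", (int)n, fft_flops(n));
    printf("fwd + mul + inv  : ~%.0f flops\n",
           2.0 * fft_flops(n) + 6.0 * (n / 2.0));
    return 0;
}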
Another user posted, some time ago, Windows binaries for a convolution
reverb built on the CUDA FFT library (cuFFT) routines. It is a) unlike
jconv, using large segments of equal size, and b) perhaps not the most
optimal solution for CUDA.
But as a proof of concept, yes, people have said that it works.
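To make the idea concrete, below is a minimal sketch (not that user's
code, and not jconv's algorithm) of the frequency-domain core such an
engine would run on the card, assuming the cuFFT routines cufftPlan1d /
cufftExecR2C / cufftExecC2R. A real engine would add segmenting of the
impulse response, overlap-add bookkeeping and buffering against JACK;
error checking is also omitted here.

#include <cufft.h>
#include <cuda_runtime.h>

/* Pointwise spectrum multiply Y[k] = Y[k] * H[k], with the 1/N scaling
 * folded in because cuFFT's inverse transform is unnormalised. */
__global__ void cmul_scale(cufftComplex *y, const cufftComplex *h,
                           int nbins, float scale)
{
    int k = blockIdx.x * blockDim.x + threadIdx.x;
    if (k < nbins) {
        cufftComplex a = y[k], b = h[k], r;
        r.x = (a.x * b.x - a.y * b.y) * scale;
        r.y = (a.x * b.y + a.y * b.x) * scale;
        y[k] = r;
    }
}

/* Convolve one zero-padded input block with one zero-padded IR segment,
 * both nfft samples long and already resident on the device.  The
 * result overwrites d_block. */
void convolve_block(float *d_block, float *d_ir, int nfft)
{
    int nbins = nfft / 2 + 1;
    cufftComplex *d_X, *d_H;
    cudaMalloc((void **)&d_X, nbins * sizeof(cufftComplex));
    cudaMalloc((void **)&d_H, nbins * sizeof(cufftComplex));

    cufftHandle fwd, inv;
    cufftPlan1d(&fwd, nfft, CUFFT_R2C, 1);
    cufftPlan1d(&inv, nfft, CUFFT_C2R, 1);

    cufftExecR2C(fwd, d_block, d_X);
    cufftExecR2C(fwd, d_ir, d_H);  /* in a real engine, done once up front */

    int threads = 256;
    int blocks = (nbins + threads - 1) / threads;
    cmul_scale<<<blocks, threads>>>(d_X, d_H, nbins, 1.0f / nfft);

    cufftExecC2R(inv, d_X, d_block);

    cufftDestroy(fwd);
    cufftDestroy(inv);
    cudaFree(d_X);
    cudaFree(d_H);
}

In a real engine the plans and the IR spectra would be set up once and
reused; what remains per period is the two transforms, the multiply, and
the transfers of the audio block to and from the card.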