[LAD] Linux-audio-dev Digest, Vol 44, Issue 6

Niels Mayer nielsmayer at gmail.com
Thu Oct 14 15:38:29 UTC 2010


On Tue, Oct 12, 2010 at 10:09 PM, Stéphane Letz <letz at grame.fr> wrote:
> Example of CUDA used for audio.

From ( http://old.nabble.com/sound-processing-in-GPU-w--Nvidia-CUDA---(was-Re:-fm-synthesis-software-)-p28142820.html ):
GPU sound processing via CUDA has already seen some use in the
Windows/Mac world:
http://www.acusticaudio.net/modules.php?name=Products&file=nebula3
http://www.kvraudio.com/forum/viewtopic.php?t=222978
http://www.kvraudio.com/forum/viewtopic.php?t=240824

http://www.nvidia.com/content/GTC/posters/2010/C01-Exploring-Recognition-Network-Representations-for-Efficient-Speech-Inference-on-the-GPU.pdf

C01 - Exploring Recognition Network Representations for Efficient
Speech Inference on the GPU
We explore two contending recognition network representations for
speech inference engines: the linear lexical model (LLM) and the
weighted finite state transducer (WFST) on NVIDIA GTX285 and GTX480
GPUs. We demonstrate that while an inference engine using the simpler
LLM representation evaluates 22x more transitions per second than the
advanced WFST representation, the simple structure of the LLM
representation allows 4.7-6.4x faster evaluation and 53-65x faster
operands gathering for each state transition. We illustrate that the
performance of a speech inference engine based on the LLM
representation is competitive with the WFST representation on highly
parallel GPUs.
Author: Jike Chong (Parasians, LLC)

http://www.nvidia.com/content/GTC/posters/2010/C02-Efficient-Automatic-Speech-Recognition-on-the-GPU.pdf

C02 - Efficient Automatic Speech Recognition on the GPU
Automatic speech recognition (ASR) technology is emerging as a
critical component in data analytics for a wealth of media data being
generated every day. ASR-based applications contain fine-grained
concurrency that has great potential to be exploited on the GPU.
However, the state-of-the-art ASR algorithm involves a highly parallel
graph traversal on an irregular graph with millions of states and
arcs, making efficient parallel implementations highly challenging. We
present four generalizable techniques including: dynamic data-gather
buffer, find-unique, lock-free data structures using atomics, and
hybrid global/local task queues. When used together, these techniques
can effectively resolve ASR implementation challenges on an NVIDIA
GPU.
Author: Jike Chong (Parasians, LLC)

-- Niels
http://nielsmayer.com


