A more efficient way to detect INF and/or NAN in a block of samples


Kjetil Matheussen

6 Oct 2013

1:34 p.m.

I want to detect INFs and NANs in my DSP graph to keep them from spreading and causing trouble. Here is the straightforward way:

    int i;
    for (i = 0; i < num_samples; i++)
        if (!isfinite(samples[i]))
            break;
    if (i != num_samples)
        error();

But is this as efficient as it gets? I'm wondering whether comparing samples using, for instance, SIMD instructions could make it around 4 times faster. Something like this:

    for (i = 0; i < num_samples; i++)
        if (samples[i] != samples[i])
            break;

where the samples[i] != samples[i] test would succeed if it was a nan or inf, since INFs and NANs don't behave normally. I don't think this particular example works though (?), but perhaps something similar could? Anyone doing something like this?
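A note on the second loop: under IEEE 754, x != x is true only for NaN, so it would miss +/-Inf. One variant that should catch both is to test x - x instead, which is 0 for finite x and NaN for Inf or NaN. A minimal sketch, assuming IEEE 754 floats and a build without -ffast-math (the function name has_nonfinite is made up for illustration):

```c
#include <stddef.h>

/* Returns 1 if any sample is NaN or +/-Inf, 0 otherwise.
   x - x is 0 for finite x and NaN for Inf or NaN, and NaN != NaN. */
static int has_nonfinite(const float *samples, size_t num_samples)
{
    for (size_t i = 0; i < num_samples; i++) {
        float d = samples[i] - samples[i];
        if (d != d)
            return 1;
    }
    return 0;
}
```

This still has an early exit in the loop, so it is about correctness of the test rather than speed.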

Kjetil Matheussen

6 Oct

3:35 p.m.

...

I don't think this particular example works though (?), but perhaps something similar could?

I guess it would work to add up all elements in the array and check whether the result is inf or nan. That operation sounds likely to be vectorized automatically by the C compiler...
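The summing idea could be sketched roughly as below. One assumption added here that is not in the thread: each sample is multiplied by 0.0f before accumulating, so that large finite samples cannot overflow the sum to Inf and give a false alarm (x * 0.0f is zero for finite x and NaN for Inf or NaN). The name block_has_nonfinite is hypothetical, and the sketch relies on IEEE 754 semantics, so it should not be built with -ffast-math. Whether a given compiler vectorizes this reduction has not been checked.

```c
#include <stddef.h>

/* Branch-free check: the accumulator stays 0 unless at least one
   sample is Inf or NaN, in which case it becomes NaN and stays NaN. */
static int block_has_nonfinite(const float *samples, size_t num_samples)
{
    float acc = 0.0f;
    for (size_t i = 0; i < num_samples; i++)
        acc += samples[i] * 0.0f;
    return acc != acc; /* true only when acc is NaN */
}
```

Unlike the early-exit loop, this always walks the whole block, which is the trade-off for having no control flow inside the loop.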

Robin Gareus

5:01 p.m.

On 10/06/2013 01:34 PM, Kjetil Matheussen wrote:

...

Probably. Unless you start to dive into CPU specifics and wander off to assembly.

...

I'm wondering whether comparing samples using, for instance, SIMD instructions could make it around 4 times faster. Something like this: for (i = 0; i < num_samples; i++) if (samples[i] != samples[i]) break; where the samples[i] != samples[i] test would succeed

You probably already know this, but be careful with this comparison when compiling with optimizations: -ffast-math may void IEEE compliance.
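If floating-point comparisons cannot be trusted under such flags, one alternative is to inspect the bit pattern instead. The sketch below assumes that float is a 32-bit IEEE 754 binary32 value, where an all-ones exponent field (mask 0x7f800000) means Inf or NaN. The function name is invented for illustration, and nothing like this was posted in the thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Integer-only check: Inf and NaN are exactly the values whose
   8 exponent bits are all set. Assumes 32-bit IEEE 754 floats. */
static int has_nonfinite_bits(const float *samples, size_t num_samples)
{
    uint32_t found = 0;
    for (size_t i = 0; i < num_samples; i++) {
        uint32_t bits;
        memcpy(&bits, &samples[i], sizeof bits); /* well-defined type pun */
        found |= (uint32_t)((bits & 0x7f800000u) == 0x7f800000u);
    }
    return found != 0;
}
```

Because the loop body contains only integer operations and no break, it does not depend on how the compiler treats NaN comparisons.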

...

if it was a nan or inf, since INFs and NANs don't behave normally. I don't think this particular example works though (?), but perhaps something similar could? Anyone doing something like this?

In my case it's not only about detecting, but also flushing them to zero. Depending on what is appropriate for the DSP at hand (meters.lv2) I settled on using math.h's isnan(), !isfinite() or simply adding something (to minus infinity). While that's probably not optimal, it is portable and architecture independent, and I don't notice any significant DSP load caused by it.

BTW gcc does not vectorize the isnan/isfinite loop that you've posted: "control flow in loop" (i386, gcc 4.7.2). No dice when replacing the break statement in the loop with v |= isfinite(); either. But it unrolls the loop at least.

Another idea: add the signal (using SSE). If one of the summands is NaN, the result will be NaN. That should effectively take less CPU (assuming that NaN is not the common case).

brainstormingly yours,
robin
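For illustration, an SSE version along these lines might look like the sketch below. It is not code from the thread or from meters.lv2. Instead of summing the raw signal it computes x - x per lane (0 for finite values, NaN for Inf or NaN) and ORs together the lanes flagged by _mm_cmpunord_ps. The function name is hypothetical, and a scalar fallback is used when __SSE__ is not defined.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Returns 1 if any sample is NaN or +/-Inf, 0 otherwise. */
static int has_nonfinite_sse(const float *samples, size_t num_samples)
{
    size_t i = 0;
#if defined(__SSE__)
    __m128 bad = _mm_setzero_ps();
    for (; i + 4 <= num_samples; i += 4) {
        __m128 x = _mm_loadu_ps(samples + i);  /* unaligned load of 4 floats */
        __m128 d = _mm_sub_ps(x, x);           /* 0 if finite, NaN otherwise */
        /* cmpunord sets a lane to all ones where d is NaN */
        bad = _mm_or_ps(bad, _mm_cmpunord_ps(d, d));
    }
    if (_mm_movemask_ps(bad) != 0)
        return 1;
#endif
    /* Scalar tail (or the whole block when SSE is unavailable). */
    for (; i < num_samples; i++) {
        float d = samples[i] - samples[i];
        if (d != d)
            return 1;
    }
    return 0;
}
```

The vector loop has no branch inside it; the result is inspected once after the loop, which fits the assumption that non-finite samples are rare.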

Kjetil Matheussen

9:07 p.m.

On Sun, Oct 6, 2013 at 5:01 PM, Robin Gareus <robin(a)gareus.org> wrote:

...

You probably already know this, but be careful with this comparison when compiling with optimizations: -ffast-math may void IEEE compliance.

I didn't know that. Thank you.

...

if it was a nan or inf, since INFs and NANs don't behave normally. I don't think this particular example works though (?), but perhaps something similar could? Anyone doing something like this?

Yeah, it would probably make a lot more sense time-wise to spend my time making the DSP graph run on several CPUs, rather than on this optimization.

...

BTW gcc does not vectorize the isnan/isfinite loop that you've posted: "control flow in loop" (i386, gcc 4.7.2). No dice when replacing the break statement in the loop with v |= isfinite(); either. But it unrolls the loop at least. Another idea: add the signal (using SSE). If one of the summands is NaN, the result will be NaN. That should effectively take less CPU (assuming that NaN is not the common case).

Thank you, I guess there are some opportunities there. To be honest though, I hoped that someone had some ready-made code I could use. :-)

But brainstorming further, it probably works to combine the peak finding routine (which is run on all signals) with the nan/inf-detection:

    static float RT_get_max_val(float *array, int num_elements){
      float ret = 0.0f;
      float minus_ret = 0.0f;

      for(int i = 0; i < num_elements; i++){
        float val = array[i];
        if(val > ret){
          ret = val;
          minus_ret = -val;
        }else if(val < minus_ret){
          ret = -val;
          minus_ret = val;
        }
      }

      // NAN/INF code here:
      if(!isfinite(ret))
        error();

      return ret;
    }
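One caveat with the routine above: under IEEE 754 every ordered comparison with NaN is false, so a NaN sample never updates ret, and the final isfinite(ret) would catch Inf but let NaN through. A possible variant, sketched here with made-up names (peak_and_check and the nonfinite out-parameter are not from the thread), accumulates val * 0.0f alongside the peak search so that both cases are flagged. It assumes IEEE 754 floats and no -ffast-math.

```c
#include <stddef.h>

/* Returns the peak absolute value of the finite samples and sets
   *nonfinite to 1 if any sample is NaN or +/-Inf, else 0. */
static float peak_and_check(const float *array, size_t num_elements,
                            int *nonfinite)
{
    float peak = 0.0f;
    float poison = 0.0f; /* becomes NaN once an Inf or NaN is seen */

    for (size_t i = 0; i < num_elements; i++) {
        float val = array[i];
        float mag = val < 0.0f ? -val : val;
        if (mag > peak)   /* false for NaN, so NaN never becomes the peak */
            peak = mag;
        poison += val * 0.0f; /* 0 for finite val, NaN for Inf or NaN */
    }

    *nonfinite = (poison != poison);
    return peak;
}
```

When *nonfinite is set, the returned peak may itself be Inf, so the caller would check the flag before using the value.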

Robin Gareus

9:31 p.m.

On 10/06/2013 09:07 PM, Kjetil Matheussen wrote:

...

But brainstorming further, it probably works to combine the peak finding routine (which is run on all signals) with the nan/inf-detection:

+1

BTW compile with -ftree-vectorizer-verbose=7 to check what gcc does.

If you're looking for something ready-made, there are SSE and AltiVec asm routines for mixing buffers and calculating the peak value in ardour. Have a look at libs/ardour/*.s, libs/ardour/mix.cc and libs/ardour/ardour/mix.h. GPL and everything, thanks to Sampo IIRC.

ciao,
robin