[LAD] How do you improve optimization for integer calculations?

Nikita Zlobin cook60020tmp at mail.ru
Wed Apr 10 11:56:02 CEST 2019


On Sun, 7 Apr 2019 23:06:45 +0500
Nikita Zlobin <cook60020tmp at mail.ru> wrote:

> I really did not recognize that nasty trick, clearing xmm0 :).
> Also, I understood why SSE can't be used there. Without integer
> division support it is undoable with SSE: replacing the division
> with multiplication means conversion to float.
>

I recently discovered a fast integer division algorithm that speeds up
repeated divisions by the same divisor. I got it working this way, but
then found out that gcc already uses this method, so integer division
is still doable with SSE after all. On the other hand, I still can't
find enough places where working with a color as a single integer,
rather than as separate channel values, would give a meaningful
benefit. One such place is the accumulator used for averaging. The
input is uint8_t[4], but the accumulator is uint16_t[4], so I have to
either process it element by element or use masks, bit shifts and OR
for each element, just to prepare a single value to store (either
uint32_t[2] or one uint64_t).
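
To make the division idea concrete, here is a minimal sketch in C. It
is not the code I actually wrote, and the names div16_t, div16_prepare
and div16_apply are made up for illustration. It assumes the dividend
fits in 16 bits (as an averaging accumulator lane does) and replaces
n / d with a multiplication by a precomputed ceil(2^32 / d) followed by
a shift:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: precomputed reciprocal for one divisor. */
typedef struct {
    uint64_t magic; /* ceil(2^32 / d) */
} div16_t;

/* Precompute once per divisor. d must be nonzero. */
static inline div16_t div16_prepare(uint16_t d)
{
    assert(d != 0);
    /* floor((2^32 - 1) / d) + 1 equals ceil(2^32 / d) for every d >= 1. */
    div16_t r = { (uint64_t)UINT32_MAX / d + 1 };
    return r;
}

/* Computes n / d with one multiplication and one shift.
 * Exact for all 16-bit n, because the rounding error in magic,
 * scaled by n, stays below 2^32. */
static inline uint16_t div16_apply(uint16_t n, div16_t r)
{
    return (uint16_t)((r.magic * n) >> 32);
}
```

Usage would be something like `div16_t r = div16_prepare(count);` once,
then `div16_apply(sum, r)` for every channel. Only multiplication and
shifts remain in the inner loop, and both exist as integer SSE
operations.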

It looks like benchmarks are necessary, using these intrinsics, to
test whether integer SSE is really better than what gcc generates.
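
As a starting point for such a benchmark, here is a rough sketch in C
of three candidate ways to add one uint8_t[4] pixel to a uint16_t[4]
accumulator. The function names are hypothetical, the SSE2 variant is
only compiled where the compiler defines __SSE2__, and no timing code
is included. Which one wins is exactly what would have to be measured.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Variant 1: plain element-wise addition. */
static void acc_add_scalar(uint16_t acc[4], const uint8_t px[4])
{
    for (int i = 0; i < 4; i++)
        acc[i] = (uint16_t)(acc[i] + px[i]);
}

/* Variant 2: four 16-bit lanes kept in one uint64_t.
 * Lane i lives in bits 16*i .. 16*i+15, so it can be read back as
 * (acc >> (16 * i)) & 0xFFFF. The caller must keep every lane below
 * 65536, otherwise a carry spills into the neighbouring lane. */
static uint64_t acc_add_packed(uint64_t acc, const uint8_t px[4])
{
    /* Debug-only overflow check, compiled out with -DNDEBUG. */
    for (int i = 0; i < 4; i++)
        assert(((acc >> (16 * i)) & 0xFFFF) + px[i] <= 0xFFFF);

    /* Widen each byte to its 16-bit lane with shifts and OR. */
    uint64_t wide = (uint64_t)px[0]
                  | (uint64_t)px[1] << 16
                  | (uint64_t)px[2] << 32
                  | (uint64_t)px[3] << 48;
    return acc + wide; /* one 64-bit add updates all four lanes */
}

#if defined(__SSE2__)
/* Variant 3: SSE2 intrinsics. */
static void acc_add_sse2(uint16_t acc[4], const uint8_t px[4])
{
    int p;
    memcpy(&p, px, sizeof p);                  /* load 4 bytes */
    __m128i v8  = _mm_cvtsi32_si128(p);
    /* Interleaving with zero widens the bytes to 16-bit lanes. */
    __m128i v16 = _mm_unpacklo_epi8(v8, _mm_setzero_si128());
    __m128i a   = _mm_loadl_epi64((const __m128i *)acc);
    a = _mm_add_epi16(a, v16);
    _mm_storel_epi64((__m128i *)acc, a);       /* store low 64 bits */
}
#endif
```

A fairer SSE test would process several pixels per call, since a single
pixel only fills a quarter of an xmm register.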

