[LAD] vectorization

Fernando Lopez-Lezcano nando at ccrma.Stanford.EDU
Thu Apr 17 22:22:56 UTC 2008


On Thu, 2008-04-17 at 18:36 +0200, Jens M Andreasen wrote:
> On Thu, 2008-04-17 at 16:14 +0200, Fernando Lopez-Lezcano wrote:
> > You mean _complete_ binaries? All of the executable replicated several
> > times with different optimizations inside the package? So your intention
> > is not to optimize selected portions of the program, but _all_ of it??
> 
> No, please be reasonable :-)

I'm trying to be. I just don't understand, see below. 

> One of the original source files holds the inner, hopefully vectorizable
> loops that eat CPU and may or may not contain ugly kludges to get
> around the denormals problem unavoidable in CPUs before SSE2.
> > 
> > And place the decision logic for which to use in pre and post install
> > scripts??
> 
> Yes please! (pretty please?)

Why?

Why do I, a lowly packager, have to learn all the ins and outs of
deciding which one of the modules to keep when you, as the programmer,
already know _exactly_ what needs to be done?? Just code the darn thing
and do the selection in your program!

> I would naively think that the package consists of object files with say
> engine.o in several versions. 
> 
> main.o
> userinterface.o
> networking.o
> ...
> engine.o.586  # plain C, runs everywhere but probably pretty terrible
> engine.o.sse  # vectorized but has some kludges 
> engine.o.sse2 # vectorized and no kludges, works for AMD, recommended!
> 
> The pre-install script then looks in /proc/cpuinfo and decides which
> engine to rename to engine.o, links the objects in a jiffy, strips the
> binary and continues installation.
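
For reference, the check such a script would make is small. What follows is a rough sketch in C, not code from any existing package: has_flag, pick_engine and read_cpu_flags are hypothetical names, the object file names are the ones from the list quoted above, and it assumes the Linux /proc/cpuinfo layout where a line starting with "flags" lists the CPU features separated by spaces.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if `flag` appears as a whole whitespace-separated word in `flags`. */
int has_flag(const char *flags, const char *flag) {
    size_t len = strlen(flag);
    const char *p = flags;
    assert(len > 0);
    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        char end = p[len];
        int ends = (end == '\0' || end == ' ' || end == '\t' || end == '\n');
        if (starts && ends)
            return 1;
        p += len; /* partial match (e.g. "sse" inside "ssse3"), keep looking */
    }
    return 0;
}

/* Map a cpuinfo "flags" line to one of the engine objects from the post. */
const char *pick_engine(const char *flags) {
    if (has_flag(flags, "sse2")) return "engine.o.sse2";
    if (has_flag(flags, "sse"))  return "engine.o.sse";
    return "engine.o.586";
}

/* Copy the first "flags" line of /proc/cpuinfo into `out` (capacity `cap`,
 * ideally a few kilobytes). Returns 1 on success, 0 if it was not found. */
int read_cpu_flags(char *out, size_t cap) {
    FILE *f = fopen("/proc/cpuinfo", "r");
    int found = 0;
    if (f == NULL)
        return 0;
    while (fgets(out, (int)cap, f) != NULL) {
        if (strncmp(out, "flags", 5) == 0) {
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Note that the answer is only valid for the machine and the moment at which it is computed, which is what the rest of this reply is about.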

(the process is more complicated when you take into account details like
being able to check the installed software for integrity after the
install is done, which would fail when things are moved or renamed or
erased, unless further hacks are done - an increase in complexity is
always bad unless there is a big payoff)

At this point your proposal is pretty much exactly what current programs
do, if I understand it correctly. You have routines that are specific
for each processor and logic that decides which one to use. 
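
For illustration, the in-program selection can be as small as a function pointer that is set once on first use. The following is a minimal sketch in C, not taken from any particular program: apply_gain, gain_plain, gain_sse2 and select_gain are hypothetical names, the two kernels share a body here only to keep the sketch short, and the CPU check assumes a recent GCC or Clang on x86 that provides __builtin_cpu_supports (anything else falls back to the plain routine).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel signature: scale a buffer of samples by a gain. */
typedef void (*gain_fn)(float *buf, size_t n, float gain);

/* Plain C version, runs everywhere. */
void gain_plain(float *buf, size_t n, float gain) {
    for (size_t i = 0; i < n; i++)
        buf[i] *= gain;
}

/* Stand-in for the variant that would live in a file built with SSE2
 * flags; the body is identical here only to keep the sketch short. */
void gain_sse2(float *buf, size_t n, float gain) {
    for (size_t i = 0; i < n; i++)
        buf[i] *= gain;
}

/* Assumption: GCC/Clang builtin on x86; other targets report no SSE2. */
int cpu_has_sse2(void) {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse2") ? 1 : 0;
#else
    return 0;
#endif
}

/* The decision logic: one place, inside the program. */
gain_fn select_gain(int has_sse2) {
    return has_sse2 ? gain_sse2 : gain_plain;
}

static gain_fn apply_gain_impl = NULL;

/* Public entry point. The variant is chosen on first call, at run time
 * (not thread-safe as written; a real program would choose at startup). */
void apply_gain(float *buf, size_t n, float gain) {
    assert(buf != NULL || n == 0);
    if (apply_gain_impl == NULL)
        apply_gain_impl = select_gain(cpu_has_sse2());
    apply_gain_impl(buf, n, gain);
}
```

Because the check happens when the program runs, moving the installed system to a different machine just changes which routine gets picked.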

Differences that I can see between the approaches:

a) the decision logic is moved from the program itself to the packaging
software:

1) disadvantages: 

- the packager does not know what is best unless the packager is the
programmer (I'm not the programmer so I don't know, you can count on
that). 

- the approach fails to account for hardware changes that happen after
the program was installed, that is, the detection is static, not dynamic
(no solution has been proposed to this). 

- I don't see any difference in performance in the resulting program:
whether the decision making is done inside the program (current
practice) or outside it in the package management software, the
result is the same, save for the differences mentioned above. 

2) advantages?

- truly, none that I can see. 
  no difference in speed
  no significant difference in size

It unnecessarily complicates the packaging process with no advantage in
optimization or speed that I can see. 

No, sorry, no can do. 

> >  A downgrade would not affect a current linux system, the kernel
> > would just load the proper modules for the new hardware and run without
> > problems. All programs I know of would adjust themselves if necessary. 
> > 
> This might be because you have the least desirable versions of the
> programs? 

Nope, you (intentionally?) miss the point. It is because the kernel
picks the most desirable version at _runtime_. 

As for the programs themselves... well, this topic has been hashed to
death many times over in other lists like the Fedora dev lists. The
advantages on modern processors of optimizing for, say, i686 are not
that big for general purpose software. If you think they are, please get
the numbers and publish them. 

In Fedora (core components) there are i686 optimized packages for glibc,
openssl and... that's it. Audio programs that need it just pick a few
critical routines at runtime. 

We are not, I think, talking about - for example - scientific numerical
solution software which is probably best compiled by experts for a given
architecture so that all possible cpu cycles are used. 

> My all-singing-all-dancing kernel is at least three times
> bigger than older kernels. This without counting any modules. And it
> takes forever to figure out that nothing new has happened since last
> reboot :-|

Does it run _faster_ once it has booted? With exactly all the same
modules loaded? (ie: exactly the same configuration). Probably not. 

> Distributions like Linux-from-Scratch do things differently. Which
> brings us back to the gunzip/configure/compile/install way of doing
> things Christian suggested from the start ...
> 
> I must admit that I hate 'configure' myself. It is darned slow and
> checks for a lot of stuff that is senseless when you already know the
> target, and then it most likely just arrives at a decision that some
> obscure library is missing. Rinse, repeat ...
> 
> A partially rpm based distribution could at least tell us to install KDE
> first and automagically do that before continuing to install stuff that
> depends on that environment.

Maybe you should try Gentoo? It is all compiled from source. Everything.
And you choose stuff like flags, etc, etc. 

> > Why do you think I, as a packager, will have access to all the possible
> > hardware? Nobody does. I don't. 
> 
> Good question. Who has the hardware and is willing to spend time on
> compiling and testing other people's projects? Who would gain anything
> from this except for the end user? I suppose he/she then is the one to
> do the lifting, except that he/she probably won't have the guts to up
> the optimiser to insane levels, nor the experience to verify that the
> application did not break ... 
> 
> Also it is a lot of wasted time. Some kind of 'man in the middle' would
> be nice to have around, which is why people are looking at you :-)

No one save for you is looking at me, I think. 
No man in the middle is needed.
You are the man!
:-)
-- Fernando




