On Thu, 2008-04-17 at 18:36 +0200, Jens M Andreasen wrote:
> On Thu, 2008-04-17 at 16:14 +0200, Fernando Lopez-Lezcano wrote:
> > You mean _complete_ binaries? All of the executable replicated several
> > times with different optimizations inside the package? So your
> > intention is not to optimize selected portions of the program, but
> > _all_ of it??
>
> No, please be reasonable :-)
I'm trying to be. I just don't understand, see below.
> > > One of the original source files holds the inner, hopefully
> > > vectorizable loops that eat cpu and may or may not contain ugly
> > > kludges to get around the denormals problem, unavoidable in cpus
> > > before sse2.
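(Aside for readers who have not met those kludges: the usual one looks
something like the sketch below - the constant and the function name are
illustrative only, not code from any particular program. Adding and then
removing a value far above the float denormal range makes anything that
has decayed down there round to zero instead of dragging older FPUs to a
crawl.)

/* Classic x87-era anti-denormal kludge (a sketch; names and constant
 * are illustrative). Compile without -ffast-math, which may optimize
 * the trick away. */
#include <stdio.h>

static float undenormalize(float x)
{
    const float tiny = 1e-18f;  /* far above the denormal range (< FLT_MIN) */
    x += tiny;                  /* a denormal x is lost in the rounding...  */
    x -= tiny;                  /* ...and comes back as a clean zero        */
    return x;
}

int main(void)
{
    printf("%g -> %g\n", 1e-40, (double)undenormalize(1e-40f)); /* denormal -> 0    */
    printf("%g -> %g\n", 0.5,   (double)undenormalize(0.5f));   /* normal unchanged */
    return 0;
}

The other common variants are mixing a tiny amount of noise or DC into
the feedback path, or setting the flush-to-zero bit in MXCSR on
sse-capable cpus.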
> > And place the decision logic for which to use in pre and post install
> > scripts??
>
> Yes please! (pretty please?)
Why?
Why do I, a lowly packager, have to learn all the ins and outs of
deciding which one of the modules to keep when you, as the programmer,
already know _exactly_ what needs to be done?? Just code the darn thing
and do the selection in your program!
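To make concrete what "do the selection in your program" means, here is a
minimal sketch of the kind of runtime dispatch audio programs already use
(the function names are made up, and it assumes gcc's <cpuid.h> helper is
available): check the cpu feature bits once at startup and route the hot
path through a function pointer.

/* Minimal runtime-dispatch sketch (illustrative names, not from any
 * real program). The feature test runs once; afterwards every call
 * goes through the pointer, so there is no per-sample overhead
 * compared to an install-time selected engine. */
#include <stdio.h>
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>
#endif

static void engine_plain(float *buf, int n) { (void)buf; (void)n; /* portable C  */ }
static void engine_sse2(float *buf, int n)  { (void)buf; (void)n; /* vector code */ }

static void (*engine_run)(float *buf, int n) = engine_plain;

static int cpu_has_sse2(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return (edx >> 26) & 1;         /* CPUID leaf 1, EDX bit 26 = sse2 */
#else
    return 0;                       /* non-x86: keep the portable engine */
#endif
}

int main(void)
{
    float buf[64] = { 0.0f };

    if (cpu_has_sse2())
        engine_run = engine_sse2;   /* one decision, made at runtime */

    engine_run(buf, 64);
    printf("selected the %s engine\n",
           engine_run == engine_sse2 ? "sse2" : "plain");
    return 0;
}

The end result is the same as selecting an object file at install time,
except that the binary keeps doing the right thing when the disk is moved
to a different machine.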
> I would naively think that the package consists of object files with,
> say, engine.o in several versions:
>
>   main.o
>   userinterface.o
>   networking.o
>   ...
>   engine.o.586  # plain C, runs everywhere but probably pretty terrible
>   engine.o.sse  # vectorized but has some kludges
>   engine.o.sse2 # vectorized and no kludges, works for AMD, recommended!
>
> The pre-install script then looks in /proc/cpuinfo and decides which
> engine to rename to engine.o, links the objects in a jiffy, strips the
> binary and continues installation.
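Just so we are talking about the same thing, the detection step of that
script amounts to roughly the following (sketched in C for concreteness;
a real pre-install hook would more likely be a few lines of shell, and
the engine.o.* names are the ones from your list):

/* Rough sketch of the install-time decision: scan the flags line of
 * /proc/cpuinfo and print which engine variant to keep. The token
 * matching is deliberately crude; this is an illustration, not a
 * robust parser. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    char line[1024];
    int has_sse = 0, has_sse2 = 0;
    FILE *f = fopen("/proc/cpuinfo", "r");

    if (f == NULL)
        return 1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "flags", 5) != 0)
            continue;
        if (strstr(line, " sse2") != NULL)
            has_sse2 = 1;
        if (strstr(line, " sse ") != NULL)
            has_sse = 1;
    }
    fclose(f);

    if (has_sse2)
        puts("engine.o.sse2");
    else if (has_sse)
        puts("engine.o.sse");
    else
        puts("engine.o.586");
    return 0;
}

My point below is that nothing is gained by making this decision once at
install time instead of once at program start.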
(The process is more complicated when you take into account details like
being able to check the installed software for integrity after the
install is done; that check would fail when things are moved, renamed or
erased, unless further hacks are added. An increase in complexity is
always bad unless there is a big payoff.)
At this point your proposal is pretty much exactly what current programs
do, if I understand it correctly. You have routines that are specific to
each processor and logic that decides which one to use.

Differences that I can see between the approaches:

a) the decision logic is moved from the program itself to the packaging
   software:

   1) disadvantages:
      - the packager does not know what is best unless the packager is
        the programmer (I'm not the programmer so I don't know, you can
        count on that).
      - the approach fails to account for hardware changes that happen
        after the program was installed; that is, the detection is
        static, not dynamic (no solution has been proposed for this).
      - I don't see any difference in performance in the resulting
        program: whether the decision making is done inside the program
        (current practice) or outside it in the package management
        software, the result is the same, save for the differences
        mentioned above.

   2) advantages?
      - truly, none that I can see: no difference in speed, no
        significant difference in size.

It unnecessarily complicates the packaging process with no advantage in
optimization or speed that I can see.
No, sorry, no can do.
> > A downgrade would not affect a current linux system, the kernel would
> > just load the proper modules for the new hardware and run without
> > problems. All programs I know of would adjust themselves if necessary.
>
> This might be because you have the least desirable versions of the
> programs?
Nope, you (intentionally?) miss the point. It is because the kernel
picks the most desirable version at _runtime_.
As for the programs themselves... well this topic has been hashed to
death many times over in other lists like the Fedora dev lists. The
advantages on modern processors of optimizing for, say, i686, are not
that big for general purpose software. If you think they are, please get
the numbers and publish them.
In Fedora (core components) there are i686 optimized packages for glibc,
openssl and... that's it. Audio programs that need it just pick a few
critical routines at runtime.
We are not, I think, talking about - for example - scientific numerical
solution software, which is probably best compiled by experts for a
given architecture so that every possible cpu cycle is used.
> My all-singing-all-dancing kernel is at least three times bigger than
> older kernels. This without counting any modules. And it takes forever
> to figure out that nothing new has happened since last reboot :-|
Does it run _faster_ once it has booted? With exactly the same modules
loaded (i.e. exactly the same configuration)? Probably not.
> Distributions like Linux-from-Scratch do things differently. Which
> brings us back to the gunzip/configure/compile/install way of doing
> things Christian suggested from the start ...
>
> I must admit that I hate 'configure' myself. It is darned slow and
> checks for a lot of stuff that is senseless when you already know the
> target, and then it most likely just arrives at a decision that some
> obscure library is missing. Rinse, repeat ...
>
> A partially rpm based distribution could at least tell us to install
> KDE first and automagically do that before continuing to install stuff
> that depends on that environment.
Maybe you should try Gentoo? It is all compiled from source. Everything.
And you choose stuff like flags, etc, etc.
> > Why do you think I, as a packager, will have access to all the
> > possible hardware? Nobody does. I don't.
>
> Good question. Who has the hardware and is willing to spend time on
> compiling and testing other people's projects? Who would gain anything
> from this except for the end user? I suppose he/she then is the one to
> do the lifting, except that he/she probably won't have the guts to up
> the optimiser to insane levels, nor the experience to verify that the
> application did not break ...
>
> Also it is a lot of wasted time. Some kind of 'man in the middle' would
> be nice to have around, which is why people are looking at you :-)
No one save for you is looking at me, I think.
No man in the middle is needed.
You are the man!
:-)
-- Fernando