[LAD] No nagging, a serious question

Niels Mayer nielsmayer at gmail.com
Mon Jul 5 06:07:18 UTC 2010


On Sun, Jul 4, 2010 at 8:17 PM, Paul Davis <paul at linuxaudiosystems.com> wrote:
> On Sun, Jul 4, 2010 at 4:15 PM, Ralf Mardorf <ralf.mardorf at alice-dsl.net> wrote:
>> Linux audio on RISC CPUs?
>
> just so we're clear, the modern x86 *is* a RISC architecture.

I disagree. These processors use pipelining techniques pioneered in
RISC designs, but that has nothing to do with the concept of
"RISC" ... especially because such pipelining and execution
optimization in a CISC processor costs a lot of transistors -- the very
thing RISC was meant to conserve.

The modern x86 adopted some of RISC's optimizations, but these chips
are certainly not RISC -- by definition -- because they do not employ
a Reduced Instruction Set: their instruction sets are huge and
complex, and their transistor counts are astronomical: the Core i7 has
731 MILLION transistors ( http://en.wikipedia.org/wiki/Transistor_count ).
In fact, with x87 floating-point ops, MMX, 3DNow!, SSE, etc., these
architectures provide ever-growing specialized instruction sets (the
opposite of RISC) and ever-growing transistor counts (also the
opposite of RISC).
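
To make that concrete, here's a minimal C sketch of my own (not from
any of the quoted material) using the SSE intrinsics that gcc and
friends expose for these extensions. A single addps instruction adds
four floats at once -- about as far from a "reduced" instruction set
as you can get:

    #include <xmmintrin.h>  /* SSE intrinsics: _mm_loadu_ps, _mm_add_ps, ... */

    /* Add two float arrays, four elements per SSE instruction (addps).
       Assumes n is a multiple of 4, just to keep the sketch short. */
    void add_floats(const float *a, const float *b, float *out, int n)
    {
        for (int i = 0; i < n; i += 4) {
            __m128 va = _mm_loadu_ps(a + i);            /* unaligned 128-bit load */
            __m128 vb = _mm_loadu_ps(b + i);
            _mm_storeu_ps(out + i, _mm_add_ps(va, vb)); /* packed single-precision add */
        }
    }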

RISC was a stupid idea in the first place: a temporary hack based on
the fabrication limitations of the time. Now that those limits have
been overcome, there's no need for RISC in general-purpose computing
-- which is why Sun Microsystems was bought by Oracle for pennies on
the dollar, and why AMD and Intel dominate.

However, for embedded apps, RISC is king, because it lets you put the
same processor core into a variety of application-specific chips. For
example, Broadcom sticks the same MIPS core into a whole range of set-top
box/DVR chips:
http://www.broadcom.com/products/Cable/Cable-Set-Top-Box-Solutions/BCM7125


http://en.wikipedia.org/wiki/X86#Current_implementations
.................
During execution, current x86 processors employ a few extra decoding
steps to split most instructions into smaller pieces
(micro-operations). These are then handed to a control unit that
buffers and schedules them in compliance with x86-semantics so that
they can be executed, partly in parallel, by one of several (more or
less specialized) execution units. These modern x86 designs are thus
superscalar, and also capable of out of order and speculative
execution (via register renaming), which means they may execute
multiple (partial or complete) x86 instructions simultaneously, and
not necessarily in the same order as given in the instruction stream.

When introduced, this approach was sometimes referred to as a "RISC
core" or as "RISC translation", partly for marketing reasons, but also
because these micro-operations share some properties with certain
types of RISC instructions. However, traditional microcode (used since
the 1950s) also inherently shares many of the same properties; the new
approach differs mainly in that the translation to micro-operations
now occurs asynchronously. Not having to synchronize the execution
units with the decode steps opens up possibilities for more analysis
of the (buffered) code stream, and therefore permits detection of
operations that can be performed in parallel, simultaneously feeding
more than one execution unit.

The latest processors also do the opposite when appropriate; they
combine certain x86 sequences (such as a compare followed by a
conditional jump) into a more complex micro-op which fits the
execution model better and thus can be executed faster or with fewer
machine resources involved.
..................
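
As a side note of mine (not part of the quoted article), that last
point is easy to see in ordinary code: with gcc -O2 on x86-64, each of
the two tests in the loop below typically compiles to a cmp
immediately followed by a conditional jump -- exactly the pair the
decoders can fuse into a single micro-op. The function is just an
illustration; the exact instructions depend on the compiler.

    #include <stddef.h>

    /* Count elements below a threshold.  The loop back-edge test and the
       threshold test each tend to become a cmp followed by a conditional
       jump, i.e. the "compare followed by a conditional jump" pattern
       mentioned above. */
    size_t count_below(const int *v, size_t n, int limit)
    {
        size_t count = 0;
        for (size_t i = 0; i < n; i++) {  /* cmp i,n        ; branch out of loop */
            if (v[i] < limit)             /* cmp v[i],limit ; branch past the ++ */
                count++;
        }
        return count;
    }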

http://en.wikipedia.org/wiki/Reduced_instruction_set_computing#RISC_and_x86
....................
Although RISC was indeed able to scale up in performance quite quickly
and cheaply, Intel took advantage of its large market by spending vast
amounts of money on processor development. Intel could spend many
times as much as any RISC manufacturer on improving low level design
and manufacturing. The same could not be said about smaller firms like
Cyrix and NexGen, but they realized that they could apply (tightly)
pipelined design practices also to the x86-architecture, just like in
the 486 and Pentium. The 6x86 and MII series did exactly this, but were
more advanced: they implemented superscalar speculative execution via
register renaming, directly at the x86-semantic level. Others, like
the Nx586 and AMD K5, did the same, but indirectly, via dynamic
microcode buffering and semi-independent superscalar scheduling and
instruction dispatch at the micro-operation level (older or simpler
‘CISC’ designs typically execute rigid micro-operation sequences
directly). The first available chip deploying such dynamic buffering
and scheduling techniques was the NexGen Nx586, released in 1994; the
AMD K5 was severely delayed and released in 1995.

Later, more powerful processors such as the Intel P6, AMD K6, AMD K7,
Pentium 4, etc. employed similar dynamic buffering and scheduling
principles and implemented loosely coupled superscalar (and
speculative) execution of micro-operation sequences generated from
several parallel x86 decoding stages. Today, these ideas have been
further refined (some x86 instruction pairs are instead merged into a
more complex micro-operation, for example) and are still used by
modern x86 processors such as the Intel Core 2 and AMD K8.
...............
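
To see what all that buffering and scheduling buys, here's a small
sketch of mine (again, not from the article): the two accumulators
below form independent dependency chains, so a superscalar,
out-of-order core can keep both additions in flight at once even
though they appear one after the other in program order, roughly
halving the critical path compared to a single accumulator.

    #include <stddef.h>

    /* Sum an array with two independent accumulators.  sum0 and sum1 never
       depend on each other inside the loop, so the two adds can issue in
       parallel on a superscalar, out-of-order core. */
    long sum_two_chains(const long *v, size_t n)  /* assumes n is even */
    {
        long sum0 = 0, sum1 = 0;
        for (size_t i = 0; i + 1 < n; i += 2) {
            sum0 += v[i];      /* dependency chain 0 */
            sum1 += v[i + 1];  /* dependency chain 1, independent of chain 0 */
        }
        return sum0 + sum1;
    }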

-- Niels
http://nielsmayer.com


