Hardware Reference
In-Depth Information
and graphical data more efficiently. These instructions are oriented to the highly parallel and often
repetitive sequences frequently found in multimedia operations. Highly parallel refers to the fact that
the same processing is done on many data points, such as when modifying a graphic image. The main
drawbacks to MMX were that it worked only on integer values and used the floating-point unit for
processing, so time was lost when a shift to floating-point operations was necessary. These
drawbacks were corrected in the additions to MMX from Intel and AMD.
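As a rough illustration of "the same processing done on many data points," the following C sketch brightens a buffer of 8-bit pixels. It is not taken from the text: the names brighten and HAVE_SSE2 are hypothetical, and it uses the later SSE2 integer intrinsics (16 bytes per instruction) as a stand-in for the original MMX instructions, which worked on 64-bit registers. On compilers or processors without SSE2 it falls back to an ordinary one-pixel-at-a-time loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: SSE2 is used here as a stand-in for MMX-style integer SIMD. */
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>   /* SSE2 integer intrinsics */
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* Hypothetical helper: add `amount` to every pixel, clamping at 255. */
void brighten(uint8_t *pixels, size_t n, uint8_t amount)
{
    assert(pixels != NULL || n == 0);
    size_t i = 0;
#if HAVE_SSE2
    /* Copy `amount` into all 16 byte lanes of one 128-bit register. */
    const __m128i delta = _mm_set1_epi8((char)amount);
    for (; i + 16 <= n; i += 16) {
        __m128i p = _mm_loadu_si128((const __m128i *)(pixels + i));
        p = _mm_adds_epu8(p, delta);   /* 16 saturating adds in one instruction */
        _mm_storeu_si128((__m128i *)(pixels + i), p);
    }
#endif
    /* Scalar loop: handles leftover pixels, or everything without SSE2. */
    for (; i < n; i++) {
        unsigned sum = (unsigned)pixels[i] + amount;
        pixels[i] = (uint8_t)(sum > 255 ? 255 : sum);
    }
}
```

The saturating add keeps bright pixels at 255 instead of wrapping around to dark values, which is why this kind of instruction suits image work.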
Intel licensed the MMX capabilities to competitors such as AMD and Cyrix (later absorbed by VIA),
who were then able to upgrade their own Intel-compatible processors with MMX technology.
SSE
In February 1999, Intel introduced the Pentium III processor and included in that processor an update
to MMX called Streaming SIMD Extensions (SSE). These were also called Katmai New
Instructions (KNI) up until their debut because they were originally included on the Katmai
processor, which was the code name for the Pentium III. The Celeron 533A and faster Celeron
processors based on the Pentium III core also support SSE instructions. The earlier Pentium II and
Celeron 533 and lower (based on the Pentium II core) do not support SSE.
The Streaming SIMD Extensions consist of 70 new instructions, including SIMD floating point,
additional SIMD integer, and cacheability control instructions. Some of the technologies that benefit
from the Streaming SIMD Extensions include advanced imaging, 3D video, streaming audio and
video (DVD playback), and speech-recognition applications.
The SSEx instructions are particularly useful with MPEG2 decoding, which is the standard compression
scheme used on DVD video discs. Therefore, SSE-equipped processors should be more capable of
performing MPEG2 decoding in software at full speed without requiring an additional hardware
MPEG2 decoder card. SSE-equipped processors are also much better and faster than previous
processors when it comes to speech recognition.
One of the main benefits of SSE over plain MMX is that it supports single-precision floating-point
SIMD operations, which had been a bottleneck in 3D graphics processing. Just as with plain
MMX, SIMD enables multiple operations to be performed per processor instruction. Specifically,
SSE supports up to four floating-point operations per cycle; that is, a single instruction can operate on
four pieces of data simultaneously. SSE floating-point instructions can be mixed with MMX
instructions with no performance penalties. SSE also supports data prefetching, which is a
mechanism for reading data into the cache before it is actually called for.
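A minimal C sketch of these two ideas, four single-precision operations per instruction and a prefetch hint, might look like the following. The function name add_floats, the HAVE_SSE macro, and the prefetch distance of 16 elements are assumptions made for illustration; the intrinsics come from the standard xmmintrin.h header. Where SSE is unavailable, the code falls back to a plain loop.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>   /* SSE intrinsics: __m128, _mm_add_ps, _mm_prefetch */
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for n floats. */
void add_floats(float *dst, const float *a, const float *b, size_t n)
{
    assert(n == 0 || (dst != NULL && a != NULL && b != NULL));
    size_t i = 0;
#if HAVE_SSE
    for (; i + 4 <= n; i += 4) {
        /* Ask the processor to start pulling upcoming data into the cache.
           The distance of 16 floats is an arbitrary, untuned choice. */
        if (i + 16 < n) {
            _mm_prefetch((const char *)(a + i + 16), _MM_HINT_T0);
            _mm_prefetch((const char *)(b + i + 16), _MM_HINT_T0);
        }
        __m128 va = _mm_loadu_ps(a + i);             /* load 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);             /* load 4 floats */
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));  /* 4 adds, 1 instruction */
    }
#endif
    /* Scalar loop: handles the leftover elements, or everything without SSE. */
    for (; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

A prefetch is only a hint: it never changes the result, and whether it helps depends on the processor and the access pattern.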
SSE includes 70 new instructions for graphics and sound processing over what MMX provided. SSE
is similar to MMX; in fact, besides being called KNI, SSE was called MMX-2 by some before it was
released. In addition to adding more MMX-style instructions, the SSE instructions allow for floating-
point calculations and now use a separate unit within the processor instead of sharing the standard
floating-point unit as MMX did.
SSE2 was introduced in November 2000, along with the Pentium 4 processor, and adds 144
additional SIMD instructions. SSE2 also includes all the previous MMX and SSE instructions.
SSE3 was introduced in February 2004, along with the Pentium 4 Prescott processor, and adds 13
new SIMD instructions to improve complex math, graphics, video encoding, and thread
synchronization. SSE3 also includes all the previous MMX, SSE, and SSE2 instructions.
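Because each generation is a superset of the previous ones, software typically checks at run time which level the processor supports before choosing a code path. The sketch below assumes a GCC or Clang compiler on an x86 processor and uses the compiler built-ins __builtin_cpu_init and __builtin_cpu_supports; the simd_support structure and the function names are hypothetical. On other compilers or architectures it simply reports that nothing was detected.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of which instruction sets were detected (1 = yes). */
typedef struct {
    int mmx, sse, sse2, sse3, ssse3;
} simd_support;

/* Fill *out using compiler built-ins (assumes GCC or Clang on x86). */
void detect_simd(simd_support *out)
{
    assert(out != NULL);
    out->mmx = out->sse = out->sse2 = out->sse3 = out->ssse3 = 0;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    out->mmx   = __builtin_cpu_supports("mmx")   != 0;
    out->sse   = __builtin_cpu_supports("sse")   != 0;
    out->sse2  = __builtin_cpu_supports("sse2")  != 0;
    out->sse3  = __builtin_cpu_supports("sse3")  != 0;
    out->ssse3 = __builtin_cpu_supports("ssse3") != 0;
#endif
}

/* Highest level found: 0 none, 1 MMX, 2 SSE, 3 SSE2, 4 SSE3, 5 SSSE3. */
int highest_simd_level(const simd_support *s)
{
    assert(s != NULL);
    if (s->ssse3) return 5;
    if (s->sse3)  return 4;
    if (s->sse2)  return 3;
    if (s->sse)   return 2;
    if (s->mmx)   return 1;
    return 0;
}
```

A program could call detect_simd once at startup and then pick, for example, an SSE2 routine when highest_simd_level returns 3 or more, and a plain C routine otherwise.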
SSSE3 (Supplemental SSE3) was introduced in June 2006 in the Xeon 5100 series server
processors, and in July 2006 in the Core 2 processors. SSSE3 adds 32 new SIMD instructions to
 