The TriMedia VLIW CPU
We studied one example of a VLIW CPU, the Itanium-2, in Chap. 5. Let us
now look at a very different VLIW processor, the TriMedia, designed by Philips,
the Dutch electronics company that also invented the audio CD and CD-ROM.
The TriMedia is intended for use as an embedded processor in image-, audio-, and
video-intensive applications such as CD, DVD, and MP3 players, CD and DVD
recorders, interactive TV sets, digital cameras, camcorders, and so on. Given these
application areas, it is not surprising that it differs considerably from the Itanium-2,
which is a general-purpose CPU intended for high-end servers.
The TriMedia is a true VLIW processor with every instruction holding as many
as five operations. Under completely optimal conditions, every clock cycle one
instruction is started and the five operations are issued. The clock runs at 266 MHz
or 300 MHz, but since five operations per cycle can be issued, the peak operation
rate is as much as five times the clock rate. In the discussion below, we will focus on
the TM3260 implementation of the TriMedia; other versions differ in minor ways
from it.
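The peak figure implied above is simple arithmetic: clock rate times the number of operations per instruction. The following is a minimal sketch of that calculation; the function name `peak_ops_per_second` is ours, and the result is an upper bound that holds only under the "completely optimal conditions" mentioned above.

```python
SLOTS_PER_INSTRUCTION = 5  # from the text: up to five operations per instruction


def peak_ops_per_second(clock_hz: int, slots: int = SLOTS_PER_INSTRUCTION) -> int:
    """Upper bound on operations issued per second, assuming one instruction
    starts every cycle and every slot holds a useful operation."""
    return clock_hz * slots


for mhz in (266, 300):
    peak = peak_ops_per_second(mhz * 1_000_000)
    print(f"{mhz} MHz clock: at most {peak / 1e9:.2f} billion operations/s")
```

Real code rarely fills all five slots in every instruction, so the sustained rate will be lower than these bounds.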
A typical instruction is illustrated in Fig. 8-3. The instructions vary from standard
8-, 16-, and 32-bit integer instructions through IEEE 754 floating-point instructions
to parallel multimedia instructions. As a consequence of the five issues
per cycle and the parallel multimedia instructions, the TriMedia is fast enough to
decode streaming DV from a camcorder at full size and full frame rate in software.
Instruction: Slot 1 = Addition | Slot 2 = Shift | Slot 3 = Multimedia | Slot 4 = Load | Slot 5 = Store
Figure 8-3. A typical TriMedia instruction, showing five possible operations.
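One way to picture the instruction of Fig. 8-3 is as a fixed group of five slots, some of which may be empty. The sketch below is only an illustration: the names `make_instruction` and `slot_utilization` are hypothetical, and the operation strings are the labels from the figure, not real TriMedia mnemonics or encodings.

```python
from typing import Optional, Sequence

NUM_SLOTS = 5  # from the text: an instruction holds up to five operations

# An instruction is modeled as five slots; None marks an empty slot.
Instruction = Sequence[Optional[str]]


def make_instruction(*ops: Optional[str]) -> tuple:
    """Build a five-slot instruction, padding unused slots with None."""
    if len(ops) > NUM_SLOTS:
        raise ValueError(f"at most {NUM_SLOTS} operations fit in one instruction")
    return tuple(ops) + (None,) * (NUM_SLOTS - len(ops))


def slot_utilization(program: Sequence[Instruction]) -> float:
    """Fraction of slots in the program that hold an operation."""
    if not program:
        return 0.0
    filled = sum(op is not None for instr in program for op in instr)
    return filled / (NUM_SLOTS * len(program))


# The instruction of Fig. 8-3, followed by a sparsely filled one.
program = [
    make_instruction("addition", "shift", "multimedia", "load", "store"),
    make_instruction("addition", None, None, "load"),
]
print(slot_utilization(program))  # 7 of the 10 slots are filled
```

The utilization figure shows how far a given instruction stream falls short of the ideal case in which all five slots are used every cycle.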
The TriMedia has a byte-oriented memory, with the I/O registers mapped into
the memory space. Half words (16 bits) and full words (32 bits) must be aligned
on their natural boundaries. It can run either as little endian or big endian, depending
on a PSW bit that the operating system can set. This bit affects only the way
load operations and store operations transfer between memory and registers. The
CPU contains a split 8-way set-associative cache, with a 64-byte line size for both
the instruction cache and the data cache. The instruction cache is 64 KB; the data
cache is 16 KB.
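The memory properties above can be made concrete with a few lines of code. This is a rough sketch rather than a description of the actual hardware: the helper names are ours, and the set count assumes the conventional relation capacity = sets x ways x line size, which the text does not spell out.

```python
import struct

LINE_SIZE = 64  # bytes per cache line (from the text)
WAYS = 8        # 8-way set-associative (from the text)


def is_aligned(address: int, size: int) -> bool:
    """True if an access of `size` bytes (2 or 4) sits on its natural boundary."""
    return address % size == 0


def word_bytes(value: int, big_endian: bool) -> bytes:
    """Memory image of a 32-bit word under the chosen byte order."""
    return struct.pack(">I" if big_endian else "<I", value)


def cache_sets(cache_bytes: int) -> int:
    """Number of sets, assuming capacity = sets * ways * line size."""
    return cache_bytes // (LINE_SIZE * WAYS)


print(is_aligned(0x1002, 2), is_aligned(0x1002, 4))      # half word OK, full word not
print(word_bytes(0x11223344, big_endian=True).hex())     # most significant byte first
print(word_bytes(0x11223344, big_endian=False).hex())    # least significant byte first
print(cache_sets(64 * 1024), cache_sets(16 * 1024))      # instruction cache, data cache
```

Under that assumption the instruction cache has 128 sets and the data cache has 32, and the low 6 bits of an address select a byte within a 64-byte line.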
There are 128 general-purpose 32-bit registers. Register R0 is hardwired to 0.
Register R1 is hardwired to 1. Attempting to change either one gives the CPU a
heart attack. The remaining 126 registers are all functionally equivalent and can be
used for any purpose. In addition, four special-purpose, 32-bit registers also exist.
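The register conventions can be captured in a toy model. The class below is a hypothetical illustration, not TriMedia code: in particular, raising an exception on a write to R0 or R1 is our modeling choice, since the text does not say precisely what the real hardware does.

```python
NUM_REGISTERS = 128
HARDWIRED = {0: 0, 1: 1}  # R0 is always 0, R1 is always 1 (from the text)


class RegisterFile:
    """Toy model of the 128 general-purpose 32-bit registers."""

    def __init__(self) -> None:
        self._regs = [0] * NUM_REGISTERS
        for index, value in HARDWIRED.items():
            self._regs[index] = value

    @staticmethod
    def _check(index: int) -> None:
        if not 0 <= index < NUM_REGISTERS:
            raise IndexError(f"no such register: R{index}")

    def read(self, index: int) -> int:
        self._check(index)
        return self._regs[index]

    def write(self, index: int, value: int) -> None:
        self._check(index)
        if index in HARDWIRED:
            # Assumption: the model simply rejects the write.
            raise ValueError(f"R{index} is hardwired to {HARDWIRED[index]}")
        self._regs[index] = value & 0xFFFFFFFF  # keep the value to 32 bits


regs = RegisterFile()
regs.write(2, 0x1_0000_0005)  # bit 32 is discarded by the 32-bit mask
print(regs.read(0), regs.read(1), regs.read(2))
```

Having the constants 0 and 1 always available in registers means that operations needing them do not have to load them first. Note also that naming one of 128 registers takes 7 bits.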
 