Jim Smith
International Symposium on Computer Architecture (1994)
Vector architectures grab sets of data elements scattered about memory, place them into
large, sequential register files, operate on data in those register files, and then disperse the
results back into memory. A single instruction operates on vectors of data, which results in
dozens of register-register operations on independent data elements.
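The gather, operate, and disperse pattern described above can be modeled in a few lines of Python. This is only an illustrative sketch, not VMIPS code: the helper names (vector_load, vector_add, vector_store), the 64-element register length, and the memory layout are all assumptions made for the example.

```python
MVL = 64  # assumed number of elements per vector register (hypothetical)

def vector_load(memory, addresses):
    """Gather elements scattered about memory into one sequential 'register'."""
    return [memory[a] for a in addresses]

def vector_add(v1, v2):
    """Model one vector instruction: an add applied to every element pair."""
    return [a + b for a, b in zip(v1, v2)]

def vector_store(memory, addresses, vreg):
    """Disperse the register's elements back to the given memory addresses."""
    for a, x in zip(addresses, vreg):
        memory[a] = x

# Toy memory where each location holds its own address.
memory = list(range(256))

src_a = list(range(0, 2 * MVL, 2))   # 64 even addresses (stride 2)
src_b = list(range(1, 2 * MVL, 2))   # 64 odd addresses (stride 2)
dst = list(range(128, 128 + MVL))    # 64 contiguous destination addresses

v1 = vector_load(memory, src_a)
v2 = vector_load(memory, src_b)
v3 = vector_add(v1, v2)              # one "instruction", 64 independent adds
vector_store(memory, dst, v3)
```

The single call to vector_add stands in for one vector instruction; the 64 element-wise additions it performs are independent of one another, which is what lets real hardware pipeline them.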
These large register files act as compiler-controlled buffers, both to hide memory latency
and to leverage memory bandwidth. Since vector loads and stores are deeply pipelined, the
program pays the long memory latency only once per vector load or store versus once per
element, thus amortizing the latency over, say, 64 elements. Indeed, vector programs strive to
keep memory busy.
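A rough cost model makes the amortization concrete. The sketch below assumes a deliberately simplified machine: each element-at-a-time load pays the full memory latency, while a pipelined vector load pays it once and then delivers one element per cycle. The 100-cycle latency and the function names are hypothetical values chosen for illustration, not figures from the text.

```python
def per_element_load_cycles(n, latency):
    """Cycles when every element load pays the full memory latency."""
    return n * latency

def vector_load_cycles(n, latency):
    """Cycles for a pipelined vector load: the first element arrives after
    `latency` cycles, and each remaining element arrives one cycle later."""
    return latency + (n - 1)

n = 64          # elements per vector, as in the example in the text
latency = 100   # assumed memory latency in cycles (hypothetical)

scalar_total = per_element_load_cycles(n, latency)
vector_total = vector_load_cycles(n, latency)

print(scalar_total, vector_total)
print(scalar_total / n, vector_total / n)   # average cycles per element
```

Under these assumptions the per-element cost of the vector load approaches one cycle as the vector grows, because the fixed latency is spread over more elements.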
VMIPS
We begin with a vector processor consisting of the primary components that Figure 4.2
shows. This processor, which is loosely based on the Cray-1, is the foundation for discussion
throughout this section. We will call this instruction set architecture VMIPS; its scalar portion
is MIPS, and its vector portion is the logical vector extension of MIPS. The rest of this
subsection examines how the basic architecture of VMIPS relates to other processors.