■ How do you program a vector computer? Architectural innovations that are a mismatch to
compiler technology may not get widespread use.
The rest of this section introduces each of these optimizations of the vector architecture, and
Appendix G goes into greater depth.
Multiple Lanes: Beyond One Element Per Clock Cycle
A critical advantage of a vector instruction set is that it allows software to pass a large amount
of parallel work to hardware using only a single short instruction. A single vector instruction
can include scores of independent operations yet be encoded in the same number of bits as a
conventional scalar instruction. The parallel semantics of a vector instruction allow an
implementation to execute these elemental operations using a deeply pipelined functional unit, as
in the VMIPS implementation we've studied so far; an array of parallel functional units; or a
combination of parallel and pipelined functional units. Figure 4.4 illustrates how to improve
vector performance by using parallel pipelines to execute a vector add instruction.
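
To make the idea of parallel pipelines concrete, the following is a minimal sketch of a vector add spread across several lanes. It is a simplified model rather than a description of any real implementation: the function name lane_vector_add, the default of four lanes, the interleaved mapping of element i to lane i mod num_lanes, and the assumption that each lane completes one element per clock cycle (with pipeline startup latency ignored) are all illustrative assumptions.

```python
def lane_vector_add(a, b, num_lanes=4):
    """Simulate a vector add with elements interleaved across lanes.

    Assumptions (illustrative only): element i is handled by lane
    i % num_lanes, and every lane completes one element per clock
    cycle. Pipeline startup latency is ignored.
    Returns the result vector and the number of modeled clock cycles.
    """
    if len(a) != len(b):
        raise ValueError("vector operands must have the same length")
    if num_lanes < 1:
        raise ValueError("num_lanes must be at least 1")

    n = len(a)
    result = [None] * n
    cycles = 0

    # Each outer iteration models one clock cycle in which every lane
    # processes at most one element.
    for base in range(0, n, num_lanes):
        for lane in range(num_lanes):
            i = base + lane  # base is a multiple of num_lanes, so i % num_lanes == lane
            if i < n:
                result[i] = a[i] + b[i]
        cycles += 1

    return result, cycles


if __name__ == "__main__":
    a = list(range(64))
    b = [2 * x for x in a]
    for lanes in (1, 4):
        _, cycles = lane_vector_add(a, b, num_lanes=lanes)
        print(f"{lanes} lane(s): {cycles} cycles for {len(a)} elements")
```

Under these assumptions, a 64-element add takes 64 modeled cycles with one lane and 16 with four, because the elemental operations are independent and can be divided evenly among the lanes.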