Fig. 23. Three examples of Systolic Arrays. (a) Matrix-like SA. (b) Linear SA. (c) SA
with signals flowing in three different directions.
been introduced in [55]. SAs are a good architectural solution for new
(beyond-CMOS) technologies, where gains in speed and required area can
be achieved only by avoiding global interconnections, a requirement that SAs
satisfy intrinsically. The SA paradigm makes it possible to design small local
circuits (the PEs) and replicate them to build the whole array structure. In QCA
in particular, several circuits have been designed and simulated; for example, SAs
for matrix multiplication and Galois field multiplication have been proposed in [56, 57].
NML implementations of convolution filters have also been proposed in [58].
While SAs help to solve the interconnection problem of this technology, they
still suffer from a loss of performance in the presence of loops. Interleaving must
therefore be used in conjunction with SAs to maximize performance [59]. As far
as optimization with pipeline interleaving is concerned, SAs can be divided
into those whose PEs are WithOut Internal Loops (WOIL) and those whose
PEs are With Internal Loops (WIL). The latter can be further divided into SAs that
Store the result in cells (WIL-S) and SAs that evaluate the final result by passing
partial results through lines (WIL-PT). WOIL SAs are composed of PEs without
loops, so pipelining alone is enough to achieve the required performance
increase. Pipeline interleaving is therefore applied to WIL SAs only.
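The effect of a loop on throughput, and the way interleaving recovers it, can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the method of [59]: it assumes that one pass around the PE loop takes a fixed number of clock cycles (`loop_latency`, a hypothetical parameter), that the PE simply accumulates its inputs, and that independent data sets may share the loop in round-robin fashion. The function name `run_loop` and the data are illustrative.

```python
from collections import deque


def run_loop(streams, loop_latency, interleave):
    """Accumulate each non-empty stream through a loop of `loop_latency` stages.

    Returns (per-stream sums, total clock cycles). Hypothetical model:
    one loop stage is traversed per clock cycle.
    """
    # Each stage holds None (a bubble) or (stream index, items consumed, partial sum).
    loop = deque([None] * loop_latency)
    sums = [None] * len(streams)
    waiting = deque(range(len(streams)))  # streams not yet started
    active = 0                            # streams currently inside the loop
    cycles = 0
    while waiting or active:
        cycles += 1
        slot = loop.pop()                 # value leaving the feedback path
        if slot is not None:
            s, pos, acc = slot
            if pos == len(streams[s]):    # stream finished: emit its result
                sums[s] = acc
                active -= 1
                slot = None
            else:                         # combine with the next input
                slot = (s, pos + 1, acc + streams[s][pos])
        # Without interleaving a new stream starts only when the loop is idle;
        # with interleaving any free stage is filled by a waiting stream.
        if slot is None and waiting and (interleave or active == 0):
            s = waiting.popleft()
            slot = (s, 1, streams[s][0])
            active += 1
        loop.appendleft(slot)
    return sums, cycles


data = [[1, 2, 3, 4], [10, 20, 30, 40], [5, 5, 5, 5]]
print(run_loop(data, loop_latency=3, interleave=False))  # ([10, 100, 20], 37)
print(run_loop(data, loop_latency=3, interleave=True))   # ([10, 100, 20], 15)
```

In this toy model the results are identical in both cases, but without interleaving two of the three loop stages carry bubbles at every cycle, whereas with three interleaved data sets every stage does useful work.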
The generic PE of a WIL SA is shown in Fig. 24. It is composed of 4 parts:
an entry section, made of blocks numbered from 1 to $i$; the forward part of the
loop, made of blocks from $i+1$ to $j$; the feedback part of the loop, made of
blocks from $j+1$ to $k-1$; the output block, called $k$. Each of these blocks has
a delay $d_n$, $n = 1, 2, \ldots, k$, and cannot be internally pipelined. Call $Z_e$ the total
delay of the entry block, $Z_{fo}$ the delay of the forward side of the loop, $Z_{fb}$ the
delay of the feedback in the loop and $Z_o$ the output delay:
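As a minimal sketch of how these four quantities relate to the block delays, the snippet below assumes that the delay of each section is simply the sum of the delays of its blocks; that additive model is an assumption made here for illustration. The function name `section_delays` and the sample values are hypothetical.

```python
def section_delays(d, i, j):
    """Return (Z_e, Z_fo, Z_fb, Z_o) for block delays d = [d_1, ..., d_k].

    i is the (1-based) index of the last entry block, j the index of the
    last forward block; block k = len(d) is the output block.
    Assumes section delays are plain sums of block delays.
    """
    k = len(d)
    if not (1 <= i < j < k):
        raise ValueError("expected 1 <= i < j < k")
    z_e = sum(d[:i])          # entry section: blocks 1 .. i
    z_fo = sum(d[i:j])        # forward part of the loop: blocks i+1 .. j
    z_fb = sum(d[j:k - 1])    # feedback part of the loop: blocks j+1 .. k-1
    z_o = d[k - 1]            # output block k
    return z_e, z_fo, z_fb, z_o


# Six blocks, entry = blocks 1-2, forward = blocks 3-4, feedback = block 5.
print(section_delays([2, 1, 3, 2, 1, 4], i=2, j=4))  # (3, 5, 1, 4)
```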