are fed to the circuit. Data from different sequences are completely independent.
The aim is to calculate (Eqs. 1 and 2)

$$\sum_{i=0}^{3} A \ast B \tag{1}$$

$$\sum_{i=0}^{3} C \ast D \tag{2}$$
[Figure 14: four panels, (A) T = 0, (B) T = 26 ck, (C) T = 52 ck, (D) T = 78 ck. Each panel shows the two multipliers and adders (ADD) with the operands (a, b, c, d) and the partial sums (S, O) present at that time.]
Fig. 14. Example of interleaving applied to a MAC unit. Two input sequences are sent
to the circuit; each supplies one pair of values every 52 clock cycles, with 26 clock
cycles between data of different sequences. The left column shows the calculation of
sequence A, B, while the right one shows the calculation of sequence C, D. As can be
observed, the two sequences are executed in parallel but do not interfere with each other.
(A) a0 and b0 are sent to the circuit. (B) After 26 clock cycles, c0 and d0 are sent
to the circuit. (C) At 52 clock cycles, a1 and b1 are sent to the circuit and reach
the adder input exactly together with S0, the result of the previous operation.
(D) At 78 clock cycles, c1 and d1 are sent to the circuit.
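The timing in Fig. 14 can be sketched as a small clock-by-clock simulation. This is only an illustrative model, not the circuit described in the text: it assumes the adder feedback loop behaves like a 52-stage pipeline, that the multiplier is ideal (zero delay), and that each sequence has at least one element. The names `interleaved_mac`, `LOOP`, and `OFFSET` are hypothetical.

```python
from collections import deque

LOOP = 52           # assumed adder feedback-loop length, in clock cycles
OFFSET = LOOP // 2  # 26 cycles between data of the two sequences


def interleaved_mac(seq_ab, seq_cd):
    """Simulate the interleaved MAC cycle by cycle.

    Returns (sum of a*b, sum of c*d). Illustrative model only.
    """
    n = len(seq_ab)
    assert n >= 1 and len(seq_cd) == n

    # schedule[t] = (stream id, product) entering the adder at cycle t.
    # The multiplier is treated as ideal, so products arrive with no delay.
    schedule = {}
    for i, (a, b) in enumerate(seq_ab):
        schedule[i * LOOP] = (0, a * b)           # T = 0, 52, 104, ...
    for i, (c, d) in enumerate(seq_cd):
        schedule[i * LOOP + OFFSET] = (1, c * d)  # T = 26, 78, 130, ...

    pipeline = deque([None] * LOOP)  # None marks an empty slot (bubble)
    results = {}
    last_cycle = (n - 1) * LOOP + OFFSET + LOOP  # last partial sum emerges here

    for t in range(last_cycle + 1):
        fed_back = pipeline.popleft()  # value leaving the adder loop at cycle t
        entry = schedule.get(t)
        if entry is None:
            # No new operand: a value emerging now is a finished sum.
            if fed_back is not None:
                results[fed_back[0]] = fed_back[1]
            pipeline.append(None)
        else:
            stream, product = entry
            if fed_back is not None:
                # The partial sum meeting this product belongs to the same
                # sequence, so the two sequences never mix.
                assert fed_back[0] == stream
            partial = 0 if fed_back is None else fed_back[1]
            pipeline.append((stream, partial + product))

    return results[0], results[1]


ab = [(1, 2), (3, 4), (5, 6), (7, 8)]
cd = [(1, 1), (2, 2), (3, 3), (4, 4)]
print(interleaved_mac(ab, cd))  # (100, 30)
```

The internal assertion checks the property the caption describes: every partial sum that returns to the adder meets a product of its own sequence, so the two accumulations share the hardware without interfering.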
At the beginning, a0 and b0, the first two data of the first sequence, are sent
to the circuit (Fig. 14 (A)). For this example only, to clarify the interleaving
principle, the multiplier is considered ideal, with no delay, so data propagate
directly from the general MAC inputs to the adder inputs. After a time
equal to half the loop length (26 clock cycles in this case), c0 and d0 are sent
to the inputs (Fig. 14 (B)). This operation is correct because there is no data
dependency between them and a0, b0. At the 52nd clock cycle, a1 and b1 are
then sent to the circuit and arrive at the adder inputs together with S0,
the result of the previous operation (Fig. 14 (C)). After another 26 clock cycles c1