Fig. 15. Simulation comparison between the original circuit behavior (A) and the
circuit with data interleaving (B). The only modification is that input sequence 1
and input sequence 2 are interleaved. As can be observed, the calculation results are
the same, except that they are also interleaved, while the total simulation time is
significantly reduced.
and d1 are sent to the circuit. They arrive at the adder inputs together with
O1 (Fig. 14 (D)). Data from the same sequence are always sent every 52 clock
cycles, which guarantees correct signal synchronization, but two sequences are
executed in the time required to execute one sequence alone, effectively doubling
the throughput.
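The interleaving scheme described above can be sketched with a small behavioral model. This is only an illustration under stated assumptions, not the authors' simulator: the feedback loop is modeled as a delay line of 52 slots (the loop latency given in the text), any additional pipeline latency of the real circuit is ignored, and the names `run_mac` and `LOOP_LATENCY` are hypothetical.

```python
from collections import deque

LOOP_LATENCY = 52  # clock cycles around the feedback loop (value from the text)


def run_mac(sequences, loop_latency=LOOP_LATENCY):
    """Behavioral model of a MAC whose accumulator loop takes loop_latency cycles.

    sequences: list of independent sequences, each a list of (a, b) pairs.
    Returns (results, total_cycles), with one accumulated sum per sequence.
    """
    k = len(sequences)
    if not 1 <= k <= loop_latency:
        raise ValueError("need between 1 and loop_latency sequences")
    spacing = loop_latency // k          # slots between two interleaved sequences
    n = max(len(seq) for seq in sequences)
    loop = deque([0] * loop_latency)     # partial sums travelling around the loop
    total_cycles = n * loop_latency
    for cycle in range(total_cycles):
        acc = loop.popleft()             # partial sum reaching the adder now
        slot = cycle % loop_latency      # position inside the current loop round
        step = cycle // loop_latency     # index of the element within a sequence
        if slot % spacing == 0 and slot // spacing < k:
            seq = sequences[slot // spacing]
            if step < len(seq):
                a, b = seq[step]
                acc += a * b             # multiply-accumulate for this sequence
        loop.append(acc)                 # send the partial sum back into the loop
    # After a whole number of rounds, slot j is back at index j of the delay line.
    results = [loop[s * spacing] for s in range(k)]
    return results, total_cycles


seq1 = [(1, 2), (3, 4)]
seq2 = [(5, 6), (7, 8)]
print(run_mac([seq1]))         # ([14], 104)
print(run_mac([seq2]))         # ([86], 104)
print(run_mac([seq1, seq2]))   # ([14, 86], 104)
```

In this model, running the two sequences one after the other takes 208 cycles, while the interleaved run produces the same two results in 104 cycles, because the second sequence occupies loop slots that would otherwise be idle.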
Figure 15 shows a MAC simulation with and without interleaving. The
sequences, originally executed serially as A, B and then C, D (Fig. 15 (A)), are
parallelized and interleaved (Fig. 15 (B)). The original time interval between two
consecutive input values (52 clock cycles) is halved. Therefore, the total
execution time of these two independent operations is halved, and the throughput
is doubled. This principle can be extended to 52 operations executed in parallel,
so that one data item is effectively sent every clock cycle. The throughput is
then maximized, as in a pure combinational circuit. This shows that interleaving
is ideally suited to NML (and QCA) circuits. Fully exploiting the potential of this
technology, however, requires a large number of independent data sequences to
process in parallel. Applications in which a large amount of data is available for
processing are therefore the best suited to this technology.
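The scaling argument can be made concrete with a short calculation. The sketch below assumes an idealized model in which k independent sequences share a feedback loop of 52 clock cycles and each occupies exactly one loop slot; the function name `interleaved_throughput` is hypothetical.

```python
from fractions import Fraction


def interleaved_throughput(k, loop_latency=52):
    """Data items accepted per clock cycle when k sequences share the loop."""
    if not 1 <= k <= loop_latency:
        raise ValueError("k must be between 1 and loop_latency")
    return Fraction(k, loop_latency)


for k in (1, 2, 26, 52):
    print(k, interleaved_throughput(k))
```

With a single sequence the circuit accepts one data item every 52 cycles (1/52), with two sequences one every 26 cycles (1/26), and with 52 interleaved sequences the ratio reaches 1, that is, one data item per clock cycle.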
Architecture Redesign for Loop Length Reduction. In addition to algorithm
rearrangement techniques like interleaving, it is also possible to modify
architectures with the aim of reducing the loop length. Since loops are the