Image Processing Reference
In-Depth Information
 
are employed: $\Pi = [1\ 1]$ for the vector schedule, $d = [1\ 0]$ for the projection vector and,
$\Sigma = [0\ 1]$ for the space processor, respectively. With these specifications the transformation
matrix becomes $T = \begin{bmatrix} \Pi \\ \Sigma \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$. Now, for a simplified test-case, we specify the following
operational parameters: m = n = 4, the period of clock of 10 ns and 32 bits data-word length.
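To make the mapping concrete, the following minimal sketch applies the transformation matrix to a small index space. It assumes a rectangular index space (i, j) with 0 <= i < m and 0 <= j < n, which the text does not spell out, and it checks the usual textbook conditions for a space-time mapping (schedule not orthogonal to the projection, space mapping orthogonal to it). The helper names `dot`, `is_valid_mapping` and `map_index_space` are hypothetical and used only for illustration.

```python
# Values taken from the text: schedule vector, projection vector, space processor.
PI = (1, 1)
D = (1, 0)
SIGMA = (0, 1)
T = (PI, SIGMA)  # transformation matrix [[1, 1], [0, 1]]

CLOCK_NS = 10  # clock period from the text


def dot(u, v):
    """Inner product of two equally sized integer vectors."""
    return sum(a * b for a, b in zip(u, v))


def is_valid_mapping(pi, sigma, d):
    """Assumed textbook conditions: pi.d != 0 and sigma.d == 0."""
    return dot(pi, d) != 0 and dot(sigma, d) == 0


def map_index_space(m, n, t_matrix):
    """Map each index point (i, j) to a (time step, processor) pair."""
    return {
        (i, j): (dot(t_matrix[0], (i, j)), dot(t_matrix[1], (i, j)))
        for i in range(m)
        for j in range(n)
    }


# Assumed index space for the simplified test case m = n = 4.
mapping = map_index_space(4, 4, T)
times = [t for t, _ in mapping.values()]
procs = {p for _, p in mapping.values()}
n_steps = max(times) - min(times) + 1

print(is_valid_mapping(PI, SIGMA, D), n_steps, len(procs), n_steps * CLOCK_NS)
```

Under these assumptions the 16 index points are scheduled over 7 time steps on 4 processors, i.e. 70 ns at the 10 ns clock. This counts schedule steps only and is not a measured latency of the actual design.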
Now, we are ready to derive the specialized bit-level matrix-format MPPAs-based
architecture. Each processor of the matrix-vector PA is next derived as an array of
processing elements (PEs) at the bit level. Once again, the space-time transformation is
employed to design the bit-level architecture of each processor unit of the matrix-vector PA.
The following specifications were considered for the bit-level multiply-accumulate
architecture:
 
$\Pi = [1\ 2]$ for the vector schedule, $d = [1\ 0]$ for the projection vector and,
$\Sigma = [0\ 1]$ for the space processor, respectively. With these specifications the transformation
matrix becomes $T = \begin{bmatrix} \Pi \\ \Sigma \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$. The specified operational parameters are the following:
l = 32 (i.e., which represents the dimension of the word-length) and the period of clock of 10
ns. The developed architecture is next illustrated in Fig. 2.
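As a quick sanity check on the bit-level specification, with schedule vector [1 2], projection vector [1 0] and space processor [0 1], the usual space-time mapping conditions can be verified by hand. The conditions themselves are a standard textbook requirement assumed here, not stated in the text; $d$ is taken as a column vector and $(i, j)$ is a hypothetical index point.

```latex
\begin{align*}
\Pi d &= \begin{bmatrix} 1 & 2 \end{bmatrix}
         \begin{bmatrix} 1 \\ 0 \end{bmatrix} = 1 \neq 0, \\
\Sigma d &= \begin{bmatrix} 0 & 1 \end{bmatrix}
            \begin{bmatrix} 1 \\ 0 \end{bmatrix} = 0, \\
T \begin{bmatrix} i \\ j \end{bmatrix}
  &= \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}
     \begin{bmatrix} i \\ j \end{bmatrix}
   = \begin{bmatrix} i + 2j \\ j \end{bmatrix}.
\end{align*}
```

Under this reading, index point $(i, j)$ would execute at time step $i + 2j$ on processing element $j$, and since $\det T = 1$ no two index points share the same time step and PE.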
From the analysis of Fig. 2, one can deduce that with the MPPA approach, the real-time
implementation of computationally complex RS operations can be achieved due to the
highly pipelined MPPA structure.
3.2 Bit-level design based on MPPAs of the high-speed VLSI accelerator
As described above, the proposed partitioning of the VLSI-FPGA platform considers the
design and fabrication of a low-power high-speed co-processor integrated circuit for the
implementation of complex matrix-vector SP operations. Fig. 3 shows the Full Adder (FA)
circuit that was used consistently throughout the design.
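For readers who want a behavioural reference for the FA cell, the sketch below models a one-bit full adder with the standard Boolean equations and chains it into a ripple-carry adder. This is only an illustration: the gate-level realization in Fig. 3 is not reproduced here, the ripple-carry chaining is not claimed to be the arrangement used in the MPPA architecture, and the names `full_adder` and `ripple_carry_add` are hypothetical.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out) for input bits a, b, cin."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(x, y, width=32):
    """Add two unsigned integers bit by bit with chained full adders.

    Returns (result modulo 2**width, final carry). The default width of 32
    matches the data-word length quoted in the text.
    """
    carry = 0
    result = 0
    for k in range(width):
        s, carry = full_adder((x >> k) & 1, (y >> k) & 1, carry)
        result |= s << k
    return result, carry


print(ripple_carry_add(5, 9))
print(ripple_carry_add(0xFFFFFFFF, 1))
```

The first call adds two small operands without overflow; the second wraps around at 32 bits and sets the final carry, which is the behaviour a fixed-width datapath exhibits.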
An extensive design analysis was carried out on the bit-level matrix-format MPPAs-based
architecture and the resulting hardware was studied comprehensively. In order to generate
an efficient architecture for the application, various issues were taken into account. The
main one was reducing the gate count, because it determines the number of
transistors (i.e., silicon area) to be used for the development of the VLSI accelerator. Power
consumption is also determined by it to some extent. The design also has to be scalable to
other technologies. The VLSI co-processor integrated circuit was designed using a
Low-Power Standard Cell library in a 0.6 µm double-poly triple-metal (DPTM) CMOS process
using the Tanner Tools® software. Each logic cell from the library is designed at the transistor
level. Additionally, S-Edit® was used for the schematic capture of the integrated circuit
using a hierarchical approach, and the layout was generated automatically through the Standard
Cell Place and Route (SPR) utility of L-Edit from Tanner Tools®.
4. Performance analysis
4.1 Metrics
In the evaluation of the proposed VLSI-FPGA architecture, a conventional
side-looking synthetic aperture radar (SAR) with the fractionally synthesized aperture is
considered as an RS imaging system (Shkvarko et al., 2008), (Wehner, 1994). The regular SFO of such SAR