Image Processing Reference
In-Depth Information
u
j
F
1 kk
ˆ
u
b
RFS ( )
j
1 kk
Fig. 1. VLSI-FPGA platform of the RSF/RASF algorithms via the HW/SW co-design
paradigm.
The basic algebraic matrix operation (i.e., the selected matrix-vector multiplication) that
constitutes the base of the most computationally consuming applications in the
reconstructive SP applications is transformed into the required parallel algorithmic
representation format. A manifold of different approaches can be used to represent parallel
algorithms, e.g. (Moldovan & Fortes, 1986), (Kung, 1988). In this study, we consider a
number of different loop optimization techniques used in high performance computing
(HPC) in order to exploit the maximum possible parallelism in the design:
- Loop unrolling,
- Nested loop optimization,
- Loop interchange.
In addition, to achieve such maximum possible parallelism in an algorithm, the so-called
data dependencies in the computations must be analyzed (Moldovan & Fortes, 1986), (Kung,
1988). Formally, these dependencies are to be expressed via the corresponding dependence
graph (DG). Following (Kung, 1988), we define the dependence graph G =[ P , E ] as a
composite set where P represents the nodes and E represents the arcs or edges in which each
e E connects
  . Next, the data dependencies
analysis of the matrix-vector multiplication algorithms should be performed aimed at their
efficient parallelization.
For example, the matrix-vector multiplication of an n × m matrix A with a vector x of
dimension
pp P that is represented as
ep
p
12
1
2
m,
given
by
y = Ax ,
can
be
algorithmically
computed
as
n
, where y and
a
represents an n- dimensional ( n - D ) output
y
a x
,
for
j
1,...,
m
j
ji
i
ji
i
1
vector and the corresponding element of A , respectively. The first SW-level transformation
is the so-called single assignment algorithm (Kung, 1988), (Castillo Atoche et al., 2010b) that
performs the computing of the matrix-vector product. Such single assignment algorithm
corresponds to a loop unrolling method in which the primary benefit in loop unrolling is to
Search WWH ::




Custom Search